Complex Number Interfaces #142
Comments
To expand a little, this generalizes nicely for scalar and vector-valued functions, both
Nonholomorphic functions don't have derivatives so they'd throw an error. Checking for nonholomorphism is doable numerically but adds overhead: Given a function
and
We can check this in

```julia
function isholo(f, z)
    # `forward` returns the value f(z) (ignored here) and a pullback closure `pb`
    _, pb = forward(f, z)
    # pull the sensitivities 1 and im back through f
    du, = pb(1)
    dv, = pb(im)
    # f passes the check when both comparisons hold approximately
    real(du) ≈ imag(dv) && real(dv) ≈ -imag(du)
end
```

In the holomorphic case you only need to compute one of

It's worth noting that this falls more elegantly out of the Wirtinger stuff that @jrevels and I have been banging at for a bit (see this issue for context, and Jarrett has implemented a lot of it in ChainRules). In the Wirtinger framework the
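A short sketch of why the `isholo` check above works, assuming the pullback treats a complex number as a pair of reals (the symbols $u$, $v$, $x$, $y$, $a$, $b$ and the name $\mathrm{pb}$ are introduced here for illustration and are not from the thread). Write $f(z) = u(x, y) + i\,v(x, y)$ with $z = x + iy$. Under that assumption, a sensitivity $a + ib$ is pulled back to

$$
\mathrm{pb}(a + ib) = \left(a\,u_x + b\,v_x\right) + i\left(a\,u_y + b\,v_y\right),
$$

so the two calls in `isholo` would return

$$
\mathrm{pb}(1) = u_x + i\,u_y, \qquad \mathrm{pb}(i) = v_x + i\,v_y .
$$

The comparison `real(du) ≈ imag(dv) && real(dv) ≈ -imag(du)` then reads

$$
u_x = v_y, \qquad v_x = -u_y,
$$

which are the Cauchy-Riemann equations, the standard criterion for a real-differentiable $f$ to be holomorphic at $z$.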
Presumably this only applies when the function is not composed of any non-holomorphic functions (e.g.
This comment is concerning, IIUC; having separate adjoints for functions of real and complex types seems pretty undesirable to me and not overall that elegant, compared to just using the same adjoints everywhere and calculating the Wirtinger derivative separately if it's required.
I saw @MikeInnes commented that ChainRules is in the pipeline #235 (comment) and @oxinabox is working on a PR #291 that does that. But could complex number handling be a blocker for #291, given that ChainRules handles complex differentiation differently? Or is the ChainRules interface flexible enough that Zygote can opt out of it?
This question is on my radar and has been for a while.
I'm closing this for now as I think we're happy with how Zygote does complex AD (of course ChainRules integration is an open question, but we can discuss that as part of #291).
Moving this discussion here from #29. We currently have a consistent way of treating complex numbers that's useful in the case of real-valued output (for gradient descent) but not always aligned with other notions of the complex derivative. At best it's only a conjugate away from a useful derivative, and at worst it's partial information (only one column of a 2x2 Jacobian).
@ssfrr suggests separating the `gradient` function from a `derivative` function that uses the more traditional definition, and can also make numerical checks that the derivative is a valid operation.
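To make the "2x2 Jacobian" and "a conjugate away" remarks concrete, here is a small worked example. It is a sketch under an assumed convention: a complex number is treated as a pair of reals, and the reverse pass is seeded with 1 on the real part of the output. The symbols $u$, $v$, $x$, $y$, $J$ and $g$ are introduced here for illustration. Write $f(z) = u(x, y) + i\,v(x, y)$ with $z = x + iy$, and view $f$ as a map $\mathbb{R}^2 \to \mathbb{R}^2$ with Jacobian

$$
J = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}.
$$

Seeding with 1 then yields

$$
g = J^\top \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} u_x \\ u_y \end{pmatrix}
\quad\longleftrightarrow\quad g = u_x + i\,u_y ,
$$

which is only the $u$ row of $J$ in this layout. If $f$ is holomorphic, the Cauchy-Riemann equations give $f'(z) = u_x + i\,v_x = u_x - i\,u_y$, so

$$
g = \overline{f'(z)} .
$$

For example, $f(z) = z^2$ has $u = x^2 - y^2$ and $v = 2xy$, so $g = 2x - 2iy = \overline{2z} = \overline{f'(z)}$. For the non-holomorphic $f(z) = \bar{z}$, with $u = x$ and $v = -y$, the same seed gives $g = 1$ and says nothing about the $v$ row $(0, -1)$, which is the partial-information case.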