Complex Differentiation #1

jrevels · 2018-02-22T20:21:21Z

Complex differentiation is one of those features that I had always planned on officially supporting in previous packages, but I never got around to grinding out the details/API consequences.

Since I see Capstan as my "fresh start" to AD now that Cassette is on the table, maybe it's time to dive into complex AD for real (pun intended).

Ideally, Capstan could eventually provide an API that supports differentiating:

non-holomorphic functions
C^N -> C^M functions
R^N -> C^M functions
C^N -> R^M functions

Relevant resources include https://arxiv.org/abs/0906.4835 (thanks for introducing me to this material, @ssfrr).

ChrisRackauckas · 2018-02-22T20:24:03Z

This is the #1 issue. 👍

antoine-levitt · 2018-02-23T10:01:45Z

Yes please!

I think in general complex numbers should be seen simply as a struct of two reals, and the backend/API should treat them as such (and therefore the jacobian of a C^N to C^M function is a 2N x 2M matrix). Particular cases where a simpler API would be useful:

complex-to-complex holomorphic functions, where a well-defined notion of (arbitrary order) derivative exists. This is tricky because intermediary steps can be non-holomorphic, and the only solution is either trust the other or to whitelist functions that are holomorphic (that excludes important functions such as real and imag that might be used to more conveniently express a function that is holomorphic in the end). Applications: implicit solving of z' = z^2, solution of the nonlinear system z^2 = 1.
complex-to-real-scalar gradient (and only gradient), see http://julianlsolvers.github.io/Optim.jl/latest/algo/complex/. Applications: optimizing abs(z - (1+im))^2.
real-to-complex functions, where arbitrarily high order derivatives are conveniently expressed just as in the real case but with complex numbers. I don't know of any particular application for real-to-complex differentiation though?

jrevels · 2018-03-06T17:18:22Z

@ssfrr and I met yesterday to discuss this, here are some notes from our discussion:

The most general form of a complex derivative of a ℂ → ℂ function can be written as a 2x2 real-valued Jacobian. From this notion, we determined that "correct" complex AD is given directly from "correct" real AD, assuming that all target programs still boil down to real-valued primitives. Properties such as holomorphism can lend extra structure to the aforementioned Jacobian, but it is not necessary to exploit that extra structure from a pure correctness standpoint.

Thus, it would be sufficient for Capstan to provide an API that accepts complex numbers, but only does real-valued AD under the hood (i.e. the Complex{Dual}-style approach).

However, things get more complicated once you start considering array primitives in reverse-mode AD. Here, you might want to define complex array primitives for all the same reasons that it's beneficial to define real-valued array primitives (e.g. to avoid unrolling long loops, to provide a hand-optimized implementation, the kernel for executing the primitive isn't Julia code, etc.). The question then becomes: What must be specified to define a complex array primitive for reverse-mode AD?

Naively, one might think they need to specify the aforementioned Jacobian for their primitive. While boiling down to real-valued differentiation like this is convenient to think about, one's ability to actually write performant code for such calculations will depend heavily on the memory layout of the input arrays. It's quite likely the arrays will be in array-of-struct format (i.e. Array{Complex}), at which point separate interactions/propagations with bulk real/imaginary components can become a huge pain.

To avoid this problem, we can design our API for defining complex-valued primitives to use Wirtinger derivatives instead of derivatives w.r.t. the real/imaginary components. Using Wirtinger derivatives also allows us to easily exploit certain properties for performance gains, for example, for holomorphic primitives, df/d(conj(z)) = 0, and for real-valued primitives with complex input, df/d(conj(z)) = conj(df/dz). These properties essentially allow us to leverage Wirtinger derivatives without paying any extra memory cost compared to formulating the AD on real/imaginary components separately.

Even with this choice, there are still some open design questions - mainly, how to handle dispatch for e.g. array primitives that have separate complex and real versions. Dispatching on element type can be a huge pain in general if it's mixed with container type dispatch, so my best idea at this point is to require that input arrays have eltype defined, and then dispatch on the result of eltype separately from the container. We'll see how it goes, though.

ssfrr · 2018-04-03T20:10:18Z

I put together a little demo implementation of Complex{Dual} perturbation seeding with test cases and a little f: C->R gradient-descent demo:

https://gist.github.com/ssfrr/6dcb548c06e18e54c35fc89874fad553

ssfrr · 2019-01-10T16:38:36Z

Some more possibly relevant discussion and info:

MikeInnes mentioned this issue Oct 26, 2018

Automatic differenciation for complex numbers FluxML/Zygote.jl#29

Closed

ssfrr mentioned this issue Apr 10, 2019

Complex Number Interfaces FluxML/Zygote.jl#142

Closed

ssfrr mentioned this issue Sep 3, 2019

Some thoughts on Wirtinger JuliaDiff/ChainRulesCore.jl#40

Open

nickrobinson251 mentioned this issue Apr 30, 2020

Complex numbers JuliaDiff/ChainRulesCore.jl#159

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Complex Differentiation #1

Complex Differentiation #1

jrevels commented Feb 22, 2018

ChrisRackauckas commented Feb 22, 2018

antoine-levitt commented Feb 23, 2018 •

edited

Loading

jrevels commented Mar 6, 2018

ssfrr commented Apr 3, 2018

ssfrr commented Jan 10, 2019 •

edited

Loading

Complex Differentiation #1

Complex Differentiation #1

Comments

jrevels commented Feb 22, 2018

ChrisRackauckas commented Feb 22, 2018

antoine-levitt commented Feb 23, 2018 • edited Loading

jrevels commented Mar 6, 2018

ssfrr commented Apr 3, 2018

ssfrr commented Jan 10, 2019 • edited Loading

antoine-levitt commented Feb 23, 2018 •

edited

Loading

ssfrr commented Jan 10, 2019 •

edited

Loading