-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Complex Differentiation #1
Comments
This is the #1 issue. 👍 |
Yes please! I think in general complex numbers should be seen simply as a struct of two reals, and the backend/API should treat them as such (and therefore the jacobian of a C^N to C^M function is a 2N x 2M matrix). Particular cases where a simpler API would be useful:
|
@ssfrr and I met yesterday to discuss this, here are some notes from our discussion: The most general form of a complex derivative of a ℂ → ℂ function can be written as a 2x2 real-valued Jacobian. From this notion, we determined that "correct" complex AD is given directly from "correct" real AD, assuming that all target programs still boil down to real-valued primitives. Properties such as holomorphism can lend extra structure to the aforementioned Jacobian, but it is not necessary to exploit that extra structure from a pure correctness standpoint. Thus, it would be sufficient for Capstan to provide an API that accepts complex numbers, but only does real-valued AD under the hood (i.e. the However, things get more complicated once you start considering array primitives in reverse-mode AD. Here, you might want to define complex array primitives for all the same reasons that it's beneficial to define real-valued array primitives (e.g. to avoid unrolling long loops, to provide a hand-optimized implementation, the kernel for executing the primitive isn't Julia code, etc.). The question then becomes: What must be specified to define a complex array primitive for reverse-mode AD? Naively, one might think they need to specify the aforementioned Jacobian for their primitive. While boiling down to real-valued differentiation like this is convenient to think about, one's ability to actually write performant code for such calculations will depend heavily on the memory layout of the input arrays. It's quite likely the arrays will be in array-of-struct format (i.e. To avoid this problem, we can design our API for defining complex-valued primitives to use Wirtinger derivatives instead of derivatives w.r.t. the real/imaginary components. Using Wirtinger derivatives also allows us to easily exploit certain properties for performance gains, for example, for holomorphic primitives, Even with this choice, there are still some open design questions - mainly, how to handle dispatch for e.g. array primitives that have separate complex and real versions. Dispatching on element type can be a huge pain in general if it's mixed with container type dispatch, so my best idea at this point is to require that input arrays have |
I put together a little demo implementation of https://gist.github.com/ssfrr/6dcb548c06e18e54c35fc89874fad553 |
Some more possibly relevant discussion and info: |
Complex differentiation is one of those features that I had always planned on officially supporting in previous packages, but I never got around to grinding out the details/API consequences.
Since I see Capstan as my "fresh start" to AD now that Cassette is on the table, maybe it's time to dive into complex AD for real (pun intended).
Ideally, Capstan could eventually provide an API that supports differentiating:
Relevant resources include https://arxiv.org/abs/0906.4835 (thanks for introducing me to this material, @ssfrr).
The text was updated successfully, but these errors were encountered: