Improve asymptotic efficiency of reverse-mode AD in interpreter #2187

athas · 2024-10-14T11:18:42Z

#2186 added AD to the interpreter. For reverse-mode, this is done by associating each value with a computation graph (really, a DAG) that represents how the value was constructed. This is equivalent to the "Wengert tape" notion of AD. To compute the derivative, this graph is then traversed. Currently, the computation graph is represented as a tree, meaning that we redundantly re-traverse the shared parts of it, which can result in exponential overhead. To fix this, we must exploit the sharing inherent in the graph structure. The easiest solution may be to associate every node in the tree with a unique number, which during the traversal would let us recognise that we are re-visiting a part of the graph. Since these numbers must be globally unique, the source of fresh numbers must be maintained in the interpreter state.

I am also open to other ways to represent the graph, but this is the simplest one I could come up with.
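
To make the problem and the proposed fix concrete, here is a minimal sketch in Python. It is not the interpreter's code: the names `Value`, `make`, `grad_tree`, and `grad_shared` are hypothetical, and the module-level counter `_fresh` only stands in for whatever would be kept in the interpreter state. The sketch assumes that every node is created after its parents, so numbers handed out in creation order make "largest number first" a valid reverse topological order.

```python
import itertools
from dataclasses import dataclass

# Hypothetical stand-in for a counter kept in the interpreter state.
_fresh = itertools.count()


@dataclass(frozen=True, eq=False)
class Value:
    primal: float
    parents: tuple = ()  # pairs of (parent Value, local partial derivative)
    uid: int = -1        # globally unique, assigned in creation order


def make(primal, parents=()):
    return Value(primal, parents, next(_fresh))


def add(a, b):
    return make(a.primal + b.primal, ((a, 1.0), (b, 1.0)))


def mul(a, b):
    return make(a.primal * b.primal, ((a, b.primal), (b, a.primal)))


def grad_tree(out):
    """Traverse the graph as if it were a tree: a shared node is
    re-visited once for every path that leads to it."""
    adjoint, visits = {}, 0

    def go(node, seed):
        nonlocal visits
        visits += 1
        adjoint[node.uid] = adjoint.get(node.uid, 0.0) + seed
        for parent, partial in node.parents:
            go(parent, seed * partial)

    go(out, 1.0)
    return adjoint, visits


def grad_shared(out):
    """Use the unique numbers to visit every reachable node exactly once."""
    # Collect reachable nodes, using uid to detect re-visits.
    reachable, stack = {}, [out]
    while stack:
        node = stack.pop()
        if node.uid not in reachable:
            reachable[node.uid] = node
            stack.extend(parent for parent, _ in node.parents)

    # Parents always have smaller uids than their children, so descending
    # uid order finalises a node's adjoint before it is propagated.
    adjoint, visits = {out.uid: 1.0}, 0
    for uid in sorted(reachable, reverse=True):
        visits += 1
        for parent, partial in reachable[uid].parents:
            adjoint[parent.uid] = adjoint.get(parent.uid, 0.0) + adjoint[uid] * partial
    return adjoint, visits
```

For a chain of ten doublings (`y = add(y, y)` repeated ten times starting from `x`), both functions compute a derivative of 1024.0 with respect to `x`, but `grad_tree` makes 2047 node visits while `grad_shared` makes 11. Sorting by number is only one way to use the identifiers; memoising on them during a depth-first traversal would be another.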

athas · 2024-11-28T20:18:53Z

Note that this is a somewhat small project. If you are a student interested in doing this, then it is unlikely to be enough work for more than a 7.5 ECTS POCS.

athas added the enhancement and student-viable (viable as a student project) labels Oct 14, 2024

athas mentioned this issue Oct 14, 2024

Implement forward- and reverse mode AD in the interpreter #2186

Merged
