Change Adjoints to be ComponentArrays #170
Open · nrontsis wants to merge 3 commits into SciML:main from nrontsis:patch-2
Here and in many other tests, I think we were testing the wrong behaviour: I think we should have used `*` instead of `.*` (like this PR does). `.*` is different from `*` in that it performs broadcasting on both sides and then multiplies elementwise. However, the axes of the (broadcasted) left- and right-hand sides are not identical, so a non-ComponentMatrix must be returned.
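For plain matrices the two operators indeed give different results; a minimal Base-Julia illustration (not taken from the PR's test suite):

```julia
A = [1 2; 3 4]
B = [5 6; 7 8]

# `*` is matrix multiplication (rows times columns):
@assert A * B == [19 22; 43 50]

# `.*` broadcasts both sides and multiplies elementwise (Hadamard product):
@assert A .* B == [5 12; 21 32]
```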
Broadcasting on GPU is a behaviour that we particularly want to test.
The axes of the two sides are not required to be identical in broadcasting (and in general).
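As a Base-only illustration of that point (no ComponentArrays involved): broadcasting happily combines mismatched axes by extruding singleton dimensions.

```julia
x = [1, 2, 3]   # axes (Base.OneTo(3),)
y = [10 20]     # 1×2 matrix, axes (Base.OneTo(1), Base.OneTo(2))

# The axes of the two sides differ, yet broadcasting combines them:
# x is treated as 3×1 and y's singleton first dimension is extruded.
@assert size(x .+ y) == (3, 2)
@assert x .+ y == [11 21; 12 22; 13 23]
```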
You are right about this test; I have therefore restored the diff here.

However, in other places, like this one, the testing logic was actually incorrect. In that particular example, broadcasting should not yield the axes that the test was specifying, as detailed in my comment above.

To correct these, I could either change the operation to matrix multiplication (instead of broadcasting) or change the testing logic. I chose the former, but I'm happy to change it if you think otherwise.
Broadcasting should extrude singleton dimensions to the non-singleton axes of that dimension within the broadcasted operation. We're following the same behavior here that the other array packages follow.
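This extrusion behavior can be seen with Base's own `Broadcast.combine_axes` (a Base-only sketch; the ComponentArrays-specific behavior under discussion is not shown here):

```julia
u = [1.0, 2.0, 3.0]
v = [10.0]   # length-1: its singleton axis is extruded along u's axis

# The combined axes pick the non-singleton axis for each dimension.
@assert Broadcast.combine_axes(u, v) == (Base.OneTo(3),)

# The single element of v is reused across u's axis.
@assert u .* v == [10.0, 20.0, 30.0]
```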
For the case of a `::AbstractVector .* ::Adjoint{_, <:AbstractVector}`, it just happens to give the same result as plain matrix multiplication.
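With plain Base arrays this coincidence can be checked directly (illustrative only; ComponentArrays is not involved here):

```julia
using LinearAlgebra  # adjoint/`'` and the vector outer product

a = [1, 2, 3]
b = [10, 20]

# Broadcasting a (3-vector) against b' (1×2 adjoint) extrudes both sides
# to 3×2 and multiplies elementwise, which coincides with the outer
# product computed by matrix multiplication a * b'.
@assert a .* b' == a * b'
@assert size(a .* b') == (3, 2)
```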
It's unclear to me what the StaticArrays example demonstrates. For example, I might argue that the result of the example with StaticArrays is compatible with the current behaviour of this PR, as the indices of `b` and `a` are already compatible, in the sense that it gives a `3×3 SMatrix{3, 3, Int64, 9}` with indices `SOneTo(3)×SOneTo(3)`, just as `a .* b'` does.

Also, in my understanding, the broadcasting behaviour of `master` is inconsistent (essentially a special case only for transposes?). In general, I find it hard to imagine a way to implement the broadcasting described in a way that results in a consistent operation both for multiplying `2x1 .* 1x1` matrices and `1x1 .* 1x1` matrices. If we want to continue this chat, it would help me a lot if we could clarify the behaviour of these cases, as I find the results of `master` on them inconsistent.
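For reference, here is how the `2x1 .* 1x1` and `1x1 .* 1x1` cases behave for plain Base matrices (an illustrative sketch; the ComponentArrays behaviour on `master` is what is under discussion, not shown here):

```julia
m21 = reshape([1, 2], 2, 1)   # 2×1 matrix
m11 = reshape([3], 1, 1)      # 1×1 matrix

# The 1×1 operand's singleton dimensions extrude to the other side's axes:
@assert size(m21 .* m11) == (2, 1)
@assert m21 .* m11 == reshape([3, 6], 2, 1)

# Two 1×1 matrices stay 1×1:
@assert size(m11 .* m11) == (1, 1)
```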
The StaticArrays example demonstrates that `Broadcast.combine_axes` is choosing the non-singleton axes of each dimension from the arrays it's given, rather than falling back to the default `Base.OneTo` axes.

There's kind of a complicated story behind why we have to special-case vector transposes. Before ArrayInterfaces.jl existed, the way DifferentialEquations.jl and some other packages created matrix cache arrays from vectors (say, for a Jacobian cache) was to call `u .* u' .* false`. Now ArrayInterfaces.jl lets you overload a `zeromatrix` function for this purpose. We want the cache matrices in these cases to be `ComponentMatrix` so that users interacting with them directly can address blocks by named indices. We could probably switch to using `zeromatrix` at this point. It may break things for some packages that are still doing `u .* u'` to expand matrices, though, so I've kind of put off the idea for fear of having to deal with that.

With that said, I actually want the current behavior we have for the vector and vector-transpose case to be the behavior for all cases. Unfortunately, it's just inherently type-unstable to do this in general, because it would mean that the size of a dimension (which generally isn't known at compile time) would determine the type of axes we choose. Consider the following examples (I'm using vectors instead of matrices here because it makes the issue clearer and removes the temptation to think of this as having anything to do with matrix multiplication):
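The core obstacle — a plain `Vector`'s length is invisible to the compiler — can be shown with Base arrays alone (a Base-only sketch standing in for the cases discussed; the `ComponentVector` variants would require ComponentArrays):

```julia
# A plain Vector's length is not part of its type, so a one-element
# vector and a longer one look identical to the compiler:
@assert typeof([1.0]) === typeof([1.0, 2.0, 3.0])

# The result axes therefore depend on runtime lengths, not on types:
# a length-1 argument is extruded, a longer one dominates.
@assert Broadcast.combine_axes([1.0], [1.0, 2.0, 3.0]) == (Base.OneTo(3),)
@assert Broadcast.combine_axes([1.0, 2.0], [1.0, 2.0]) == (Base.OneTo(2),)
```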
Within each group, the argument types are the same, so for type stability we need the return types to be the same as well.

In Group 1, Case 1 (I'll abbreviate it to G1C1), it's clear that we should want to use the axis of the `ComponentVector`, since the regular `Vector` has a single element and will thus be extruded into the axis of the other array for that dimension. If we want this, however, we also need to make the `ComponentVector` axis "win" for G1C2, since the compiler doesn't know how many elements the plain `Vector` has.

In G2C1, we're faced with a similar choice, and it seems like we should similarly let the `ComponentVector` "win" here for consistency. But clearly we can't do that, because in G2C2, which looks the same to the compiler as G2C1, we will be extruding the single-element `ComponentVector` onto the axes of the plain `Vector`. So for this group we have to let both results fall back to giving a plain `Vector` as output, despite the fact that we would probably prefer G2C1 to be a `ComponentVector` for consistency.

For inspiration into how this should be handled, I think we should look back at StaticArrays and see how they do it:
Since StaticArrays have knowledge of their size, they make the decision to always use the static array's axis when the length of that dimension of the `StaticArray` is `> 1`, and to always use the other array's axis when it's `== 1`. This makes sense and probably gives the best tradeoff between consistency and usefulness.

Could we do the same thing for ComponentArrays? Well... sorta. In most usage of ComponentArrays, the named indices correspond 1:1 to the array elements, so we can statically know the size of the `ComponentArray` by looking at the last index of each dimension of the component axes. But there is an undocumented yet still very useful use of `Axes` that doesn't follow this behavior (see #163). So we can't necessarily guarantee that a dimension of a `ComponentArray` is length 1 just because its component axis type information seems to indicate it is. But maybe we could just assume they are and throw an error in cases where this isn't true?
Thanks for the detailed explanation.

This sounds ideal, as I am all in on eliminating special cases. Assuming that `u .* u'` (`.* false`?) is used to expand matrices in some packages (that we don't test against?) sounds fragile to me and, as far as I understand, it already "fails to work" in very popular packages that I happen to use.

The potential behaviour you described seems complete and well thought out (at first look), but I am of the opinion that, unless or until we have a complete and documented behaviour, the resulting complexity and confusion will make the simplicity of a uniform rule like "broadcasting results in `FlatAxis`" (as per this PR) preferable — and that's before taking maintenance overhead into account.

Nevertheless, I think this chat might be expanding beyond the scope of this PR. If you want to keep the special case for transposed broadcasts, I will include it in this PR. I tried to do that today by adding lines in `broadcasted.jl` to substitute the lines in that file currently deleted by this PR. This works for `ComponentVector(a=0) .* ComponentVector(b=0)'` but not for `false .* ComponentVector(a=0) .* ComponentVector(b=0)'`.

I failed to find why it does not work (while it works on `master`), even after spending a couple of hours on this today, so I would appreciate some help, assuming that's the route we are taking.