Avoid assembling ghost row off-diagonals. #623
base: master
Conversation
jenkins build this please
```cpp
for (unsigned dofIdx = 0; dofIdx < stencil.numDof(); ++dofIdx) {
    if (dofIdx > 0 && stencil.element(dofIdx).partitionType() != Dune::PartitionType::InteriorEntity) {
```
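To make the effect of this check concrete, here is a self-contained sketch of how skipping non-interior rows changes the sparsity pattern while still keeping the ghost diagonal. The types `PartitionType`, `Dof`, and `addStencil` are hypothetical mocks invented for illustration, not the actual eWoms/Dune API:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-ins for the Dune/eWoms types; invented for illustration.
enum class PartitionType { Interior, Ghost };

struct Dof {
    int index;                // global matrix row/column index of this dof
    PartitionType partition;  // Interior on this rank, Ghost otherwise
};

// Sketch of the patched sparsity-pattern assembly for one element with index
// myIdx: the stencil is traversed "column-major", i.e. each stencil dof
// contributes the entry (dof.index, myIdx). Rows belonging to non-interior
// dofs are skipped, except that their diagonal entry is still added.
void addStencil(int myIdx, const std::vector<Dof>& stencil,
                std::vector<std::set<int>>& pattern)
{
    for (std::size_t dofIdx = 0; dofIdx < stencil.size(); ++dofIdx) {
        const Dof& dof = stencil[dofIdx];
        if (dofIdx > 0 && dof.partition != PartitionType::Interior) {
            pattern[dof.index].insert(dof.index); // ghost row: diagonal only
            continue;
        }
        pattern[dof.index].insert(myIdx); // regular entry in column myIdx
    }
}
```

Running this over the stencils of one interior and one ghost element leaves the ghost row with only its diagonal entry, while the ghost column still appears in the interior rows, so the resulting pattern is not symmetric.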
This will break all discretizations besides cell-centered finite volume in parallel.
I feared something like that... So what is the best solution? Make a cell-centered linearizer class that overrides these methods? I first experimented with modifying the stencil (which is specific to the discretization), but that did not work because the stencil is also used to assemble the fluxes (obvious in hindsight...), which should be included.
And I think it might not even work for cell-centered:
If we are on an interior element, then we want to have the connections to other non-interior dofs in the sparsity pattern. But with your changes we do not get these, do we? Hence even for cell-centered finite volumes we need an additional check whether the current element is in the interior, and only if it is not should your new code be executed.
On that, I think you are not correct. The proposed code eliminates the ghost rows (except the diagonal) but not the ghost columns. The sparsity pattern created is no longer symmetric; note that myIdx is the column number, and we do not eliminate based on that, but on the neighborIdx (using the corresponding element).
IMHO, this kind of optimization is pretty much on the edge of the cost/benefit ratio (a bit over it, actually). We are trying to squeeze out very small performance gains at the cost of generality and maintainability. The optimizations made in flow are at least contained in the application space (but they did and still do cost a lot of hours...), whereas here we are in a library, and there needs to be a very clear benefit to justify increasing the maintenance burden or even losing generality (actually the latter should never happen).
I see your point. However, should ghost row entries not be eliminated like this in general? Even if the parallel linear operators can deal with them, they do not need them and gain from the elimination. And the preconditioners require us to eliminate their influence one way or another. So should we not rather try to generalize this, and get things to scale better with the other discretizations as well? What I said above applies to ISTL at least. Perhaps this issue depends on the framework you are using, and for PETSc, for example, a different way of thinking may be needed or correct. That is in any case beyond my expertise at the moment.
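For context on "eliminating their influence one way or another": one common alternative is to assemble ghost rows normally and afterwards overwrite them with unit diagonal rows before preconditioning. The sketch below uses a hypothetical toy matrix type; `SparseMatrix` and `makeGhostRowsIdentity` are invented for illustration and are not ISTL API:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy sparse matrix: one (column -> value) map per row. Invented for
// illustration; a real code would use e.g. a block-compressed row matrix.
using SparseMatrix = std::vector<std::map<int, double>>;

// After assembly, replace every ghost row by a unit diagonal row so that
// preconditioners see no coupling originating from ghost rows.
void makeGhostRowsIdentity(SparseMatrix& A, const std::vector<bool>& isGhost)
{
    for (std::size_t row = 0; row < A.size(); ++row) {
        if (!isGhost[row])
            continue;
        A[row].clear();                      // drop all off-diagonal entries
        A[row][static_cast<int>(row)] = 1.0; // keep a unit diagonal
    }
}
```

Unlike the approach in this PR, which avoids putting the entries into the sparsity pattern in the first place, overwriting after assembly keeps the pattern symmetric at the cost of still storing the ghost off-diagonal blocks.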
I misread the code (I did not notice that the stencil is column-major and in my head I stayed in row-major mode). Yes, you are right that the code is correct for cell-centered finite volumes and one level of overlap. For all other cases it is not. The problem is that in general the handling of the processor boundaries depends on the discretization type and parallelization approach. There might also be preconditioners that depend on a symmetric sparsity pattern for performance/efficiency reasons.
That indicates to me that the proper way of dealing with this is to keep the generic code "safe" (and unchanged), while allowing specific code (for a specific discretization) to override the behaviour. If I continue this work, I'll try to move it to an "ecfv" context. Thanks for pointing out that preconditioners might have their own requirements; do you know of any examples (currently implemented in istl or not) where this becomes an issue? In particular, I hope it is not an issue with AMG?
Again: it depends on the discretization and the parallelization approach. Currently it will work with flow, but if you change the latter (and there have been discussions about that), it will not.
The diagonal element will still be assembled.
Force-pushed from 28488de to 4e51e8f.
The diagonal element will still be assembled. This reduces the number of non-zero blocks in the matrix in parallel.