-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a generic evaluator function #306
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve, but I have some questions and a comment:
If I'm understanding correctly, the main advantage here is that you can string together an arbitrary number of EOS lookup functions under a single mpark::visit()
, right? This begs the question, is the mpark::visit()
function that impactful for performance? It doesn't get optimized away? Have you profiled this change?
As a tangent in defense of the current vector interface, it was never intended to be called on device. The predominant wisdom at the time it was written was that the only real way that the EOS would be called from device would be via the scalar interface and that the host could chain together whatever calls it needed. This duplicates the visit()
calls, yes, but otherwise allows the flexible parallelism you mention in your PR description.
EDIT - I may have come off too critical. I think this is a very useful addition and could be powerful in a number of different scenarios.
As an aside, does this functionality eliminate the need for EDIT - Or does it make sense to provide pre-made injectables rather than defining these within the EOS or base class? For example, we could use this to eliminate |
I know this isn't what you were intending here, but if you extend the existing vector interface to have a vector version of The other downside of this approach is that a fortran-based host code would have to provide its own C++ interface, which isn't something we're currently doing. |
My understanding is that the visit removes the possibility of vectorization. This lifts the visit higher up and thus in a lower level loop vectorization is possible. This basically allows downstream codes to write functors that can get at the underlying eos type and use that directly similar to how the vector interface accomplishes this. @jdolence has found this to have noticeable speedup. I am not clear on the main contributor to the speedup, that would be helpful to know though. As for the vector interface, it is still necessary and helpful as it provides concrete functions that can be called from fortran. |
I'm not going to review the code, since @jdolence and I came up with it over a whiteboard conversation. I might take a stab at adding a unit test, though, which I want before this gets merged. A few comments regarding the above discussion:
There are, I think, a few benefits:
It is worth noting that @jdolence only tested for ideal gas. It's possible our other EOS models are too complex to vectorize anyway. That said, we use ideal gas sometimes and being able to vectorize for it is probably a win.
I don't think we can eliminate the current vector interface at this time. I thought about rolling it into this API, i.e., redefining the vector valls to just call
No---because
This I am a fan of! Would love to have an injectable instead of using
This would be cool! I don't know if this kind of thing is possible with pybind. Maybe @rbberger knows? |
Sorry -- reread my PR description and I think it may have come off more critical than I intended of the current vector interface. There's nothing wrong with that interface, it just does a very specific thing that happens not to align with what I'm trying to do. As @dholladay00 points out, I'm not sure if it's the cost of the Another thing this PR enables are other kinds of operations that may or may not have anything to do with EOS in the same loop. For example, say you wanted to actually compute sound speed. You could write a functor that calls the EOS for bulk modulus and uses that result to compute the adiabatic sound speed. Really you can write whatever you want to execute in the functor just as you would anything else in your downstream code. And another thing this enables is for downstream codes to use their native data structures, if they desire. There may still be advantages to copying state out into a contiguous vector and then copying back results, but it's probably not always advantageous to do so. It could be that many of the EOSs won't vectorize anyway, so the only effective difference from just calling the EOS functions on the variant is that you call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs tests. But otherwise LGTM. Pleased how this came out. :D
Thanks for answering my questions @dholladay00 @Yurlungur and @jdolence ! I think I understand this better now. I think this is ready to be merged. I'd like to also decrease the number of overloads we have on the vector interface in the future and this might be a tool to do so. I'll have to give it more thought though. |
I'd be interested to see if this improves performance in places where consecutive eos calls are present such as in get_sg_eos and in the PTE solvers. |
Thanks for the reminder!. A really simple test should cover it and then it can be merged |
Can't be used in the PTE solvers yet. We'd need a vector solver. |
But it can be used where we request multiple things from an EOS in quick succession (like within a PTE solver) |
I am thinking specifically in material loops there are several places where multiple eos calls are made per loop iteration, the evaluate could be used in such a way as to only require 1 visit per loop iteration. This would only be helpful if visits themselves are expensive. It's possible there are situations where they may be if, for example, every visit consumes a lot of registers. |
Ah I see. Yes that's true. That might buy us something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a stab at documentation and tests. Tests triggered on Darwin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the documentation and test @Yurlungur ! This looks good to me other than a minor change to the docs
Tests now pass on darwin. As soon as tests pass on github we can merge. |
PR Summary
As discussed with @Yurlungur...
This is a tiny PR that adds a fair bit more flexibility to how downstream codes interact with singularity-eos. The new functionality is a new
Evaluate
template function for bothEosBase
andVariant
that can take in a downstream-defined functor and evaluate it with the underlying eos object. This allows one to, for example, express a parallel loop that calls several EOS functions at once in the downstream code, and then have singularity execute that parallel loop with one of the basic EOSs (as opposed to having to visit the variant in every call).The current vector interface is quite constraining. For example, if you're running on GPU, it assumes you're calling from the host and that you want to launch a new kernel to make the eos call. In our downstream code, we use hierarchical parallelism where what we really want is to have an inner loop inside our kernel that calls the EOSs. This new feature allows us to write our inner loop as we do all our other inner loops, pass it in to singularity, and have it execute as we expressed but using the CRTP stuff to avoid many calls to
mpark::visit
.I think the python interface requires something like the existing vector interface, otherwise I would argue that this is much more flexible and may be preferred in general for interfacing with downstream codes. The only downside is that it requires the downstream code to write a functor (or some number of functors to capture all the EOS calling patterns).
PR Checklist
make format
command after configuring withcmake
.