Err stats #76
base: develop
@@ -198,7 +211,9 @@ class AMSWorkflow
#ifdef __ENABLE_MPI__
comm(MPI_COMM_NULL),
#endif
ePolicy(AMSExecPolicy::AMS_UBALANCED)
ePolicy(AMSExecPolicy::AMS_UBALANCED),
@koparasy fix this before we land it to develop.
@JaeseungYeom Thank you for the PR! I am afraid this version moves too much data across ranks. The predicate array, although boolean, can be fairly large in an actual large simulation, so gathering and scattering it across the distributed system will be a bottleneck. How about this solution here:
The algorithm is probably not correct, but I tried to use meaningful variable names wherever possible so that you can follow it. The concept is to limit the amount of data sent over the network. This code has 1 Gather and 1 Scatter, and the message size grows linearly with the number of ranks. Your code instead has 1 Gather, 1 Scatter, 1 GatherV, and 1 ScatterV. GatherV and ScatterV may end up sending a lot of data, which we cannot sustain in the inner loop. Let me know if I can help further. Thank you!
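For readers following the thread, here is a minimal sketch of the kind of root-side computation that can sit between a single Gather and a single Scatter. It is not the code proposed in the review: the function name `computeIdleSlack`, the choice of one `int` per rank, and the assumption that a rank's cost is proportional to its number of physics evaluations are all hypothetical and only illustrate why the message size stays linear in the number of ranks.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the PR. Input: the number of physics
// evaluations each rank must perform in this iteration (one int per rank).
// Output: how many extra evaluations each rank could absorb before it
// becomes as busy as the busiest rank.
// In an MPI setting, `physicsCounts` would be what the root receives from
// one MPI_Gather of a single int per rank, and the returned vector is what
// one MPI_Scatter would send back, so neither message depends on the size
// of the predicate array.
std::vector<int> computeIdleSlack(const std::vector<int>& physicsCounts)
{
  std::vector<int> slack(physicsCounts.size(), 0);
  if (physicsCounts.empty()) return slack;

  // The busiest rank determines how long every other rank sits idle.
  const int busiest =
      *std::max_element(physicsCounts.begin(), physicsCounts.end());

  for (std::size_t r = 0; r < physicsCounts.size(); ++r) {
    slack[r] = busiest - physicsCounts[r];
  }
  return slack;
}
```

With counts `{10, 4, 7, 10}`, ranks 1 and 2 are the idle ones and could take on 6 and 3 extra evaluations respectively, while ranks 0 and 3 get none.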
I got the idea. It is an excellent suggestion. I will implement something along the lines of this.
Take advantage of ranks left idle by load imbalance to compute extra physics model outputs, so that those can be compared against the surrogate outputs.
Currently, the average and variation of the surrogate output error at the chosen validation points are reported per evaluation iteration.
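As a rough illustration of the reported statistics, the sketch below computes the mean and the variance of the absolute error between surrogate and physics outputs at a set of validation points. The names `ErrorStats` and `computeErrorStats` are hypothetical, and reading "variation" as the population variance of the absolute error is an assumption; the PR may use a different error measure.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container for the per-iteration error summary.
struct ErrorStats {
  double mean;      // average absolute error over the validation points
  double variance;  // population variance of the absolute error (assumed)
};

// Compare surrogate outputs with physics outputs at the same validation
// points and summarize the absolute error.
ErrorStats computeErrorStats(const std::vector<double>& surrogate,
                             const std::vector<double>& physics)
{
  if (surrogate.size() != physics.size() || surrogate.empty()) {
    throw std::invalid_argument("inputs must be non-empty and equally sized");
  }
  const double n = static_cast<double>(surrogate.size());

  // First pass: mean of the absolute error.
  double sum = 0.0;
  for (std::size_t i = 0; i < surrogate.size(); ++i) {
    sum += std::abs(surrogate[i] - physics[i]);
  }
  const double mean = sum / n;

  // Second pass: mean squared deviation from that mean.
  double sqDev = 0.0;
  for (std::size_t i = 0; i < surrogate.size(); ++i) {
    const double d = std::abs(surrogate[i] - physics[i]) - mean;
    sqDev += d * d;
  }
  return {mean, sqDev / n};
}
```

The two-pass form is used here for clarity; a streaming update would avoid storing the outputs if the validation points arrive incrementally.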