-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/PROTO: Consider RNDV_PERF_DIFF #10292
base: master
Are you sure you want to change the base?
UCP/PROTO: Consider RNDV_PERF_DIFF #10292
Conversation
@yosefe @brminich currently patch improves all RNDV protocols, but maybe it worth to think about applying this change to RNDV-get/put only since there can be cases when RMA is unsupported for some reason and on big sizes there would be comparison between Eager and RNDV through AM and BW of latter would be slightly higher due to |
I’d leave it for all protocols for simplicity. Additionally, we have PPLN protocols, and even AM-based RNDV may be preferable to eager due to its lower memory consumption. |
I thought AM-based RNDV and PPLN protocols consume same amount of memory as eager, aren't they? Do AM-based protocols send message directly to user-provided buffer? |
In case of tag matching API eager messages are queued on the receiver until receive operation is invoked. Such eager fragments may consume significant amount of memory on the receiver. With RNDV data transfer is started only when receiver is ready |
src/ucp/proto/proto_perf.c
Outdated
@@ -430,6 +430,27 @@ ucs_status_t ucp_proto_perf_aggregate2(const char *name, | |||
return ucp_proto_perf_aggregate(name, perf_elems, 2, perf_p); | |||
} | |||
|
|||
void ucp_proto_perf_apply_bias(ucp_proto_perf_t *perf, double bias) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/ucp/proto/proto_perf.c
Outdated
ucp_proto_perf_factor_id_t fid; | ||
ucp_proto_perf_segment_t *seg; | ||
|
||
if (bias == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compare with some epsilon value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ucp_proto_perf_segment_foreach(seg, perf) { | ||
for (fid = 0; fid < UCP_PROTO_PERF_FACTOR_LAST; ++fid) { | ||
seg->perf_factors[fid] = ucs_linear_func_compose( | ||
bias_func, seg->perf_factors[fid]); | ||
ucp_proto_perf_node_update_factors(seg->node, seg->perf_factors); | ||
} | ||
bias_node = ucp_proto_perf_node_new_data("bias", "%.2f %%", bias); | ||
ucp_proto_perf_node_own_child(seg->node, &bias_node); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we implement this using ucp_proto_perf_add_funcs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think that this would be more laconic/readable since ucp_proto_perf_add_funcs
is designed to sum up one factors with another ones on some range, so:
- We need to sumehow replace sum logic by multiplication
- We still need to do that for each segment since doing it from 0 to SIZE_MAX can expand perf layout
What?
Applies
UCX_RNDV_PERF_DIFF
setting effect for protov2.Why?
This control effect was removed during introducing perf-factors logic. This PR brings it back.
That how it looks like with that patch: