-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add per model metrics #40
Add per model metrics #40
Conversation
- Add `modelId` parameter to `logTimingMetricDuration` function in `Metrics.java`: - `modelmesh_cache_miss_milliseconds` - `modelmesh_loadmodel_milliseconds` - `modelmesh_unloadmodel_milliseconds` - `modelmesh_req_queue_delay_milliseconds` - `modelmesh_model_sizing_milliseconds` - `modelmesh_age_at_eviction_milliseconds` - Add `modelId` parameter to `logSizeEventMetric` function in `Metrics.java`: - `modelmesh_loading_queue_delay_milliseconds` - `modelmesh_loaded_model_size_bytes` - Add `modelId` and `vModelId` param to `logRequestMetrics` in `Metrics.java`: - `modelmesh_invoke_model_milliseconds` - `modelmesh_api_request_milliseconds` Closes red-hat-data-services#60 Signed-off-by: Vedant Mahabaleshwarkar <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Prashant Sharma <[email protected]> Co-authored-by: Daniele Zonca <[email protected]> Co-authored-by: Nick Hill <[email protected]>
/retest |
} | ||
|
||
@Test | ||
@BeforeAll |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is intended to be executed before the suite, it would be good to rename the method to something more accurate, e.g. prepareMetricsEnv
but the method itself, seems like a test scenario, not something to be executed before.
public void logTimingMetricDuration(Metric metric, long elapsed, boolean isNano, String modelId) { | ||
Histogram histogram = (Histogram) metricsMap.get(metric); | ||
if (perModelMetricsEnabled && modelId != null) { | ||
histogram.labels(modelId, "").observe(isNano ? elapsed / M : elapsed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a description for this label.
|
||
String methodName = idx == -1 ? name : name.substring(idx + 1); | ||
if (perModelMetricsEnabled) { | ||
modelId = Strings.nullToEmpty(modelId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my 2 cents:
Here I would use Optional or Objects.toString(env, $desiredValue).
It will avoid using external libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a few comments, not mandatory though.
Thanks.
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Jooho, VedantMahabaleshwarkar The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
2190b2e
into
opendatahub-io:release-0.11
for step 2 of opendatahub-io/modelmesh-serving#228
Not sure why the unit tests from Openshift CI are failing on the PR. I manually verified that all tests are successful. To replicate, check out the PR branch
VedantMahabaleshwarkar:metrics_cherrypick
and runmvn -B package --file pom.xml
to verify all unit tests pass.TESTING INSTRUCTIONS :
sum(increase(modelmesh_api_request_milliseconds_count{vModelId='example-onnx-mnist'}[1m]))
vModelId=<your-model-name>
Ensure the query is visible without elevated permissions :
edit
andmonitoring-rules-view
permissions over the model namespacesum(increase(modelmesh_api_request_milliseconds_count{vModelId='example-onnx-mnist'}[1m]))
and verify it returns the expected data