Add per model metrics #40

VedantMahabaleshwarkar · 2023-10-11T13:39:28Z

For step 2 of opendatahub-io/modelmesh-serving#228

I'm not sure why the unit tests from OpenShift CI are failing on this PR; I manually verified that all tests pass. To replicate, check out the PR branch `VedantMahabaleshwarkar:metrics_cherrypick` and run `mvn -B package --file pom.xml` to verify that all unit tests pass.

TESTING INSTRUCTIONS:

- Install ODH Operator 1.10.1 from OperatorHub
- Create the following kfdef
- Deploy a sample modelmesh model in any Data Science Project
- Run some inference requests against the model
- Navigate to OpenShift Console -> Observe -> Metrics, run the following query, and verify that it returns the expected data:
  - `sum(increase(modelmesh_api_request_milliseconds_count{vModelId='example-onnx-mnist'}[1m]))`
  - Note: change `vModelId` to `<your-model-name>`

Ensure the query is visible without elevated permissions:

- Create a regular user with edit and monitoring-rules-view permissions over the model namespace
- Log in to the cluster as the regular user
- In OpenShift Console -> Developer view -> Observe -> Metrics -> Custom Query, run `sum(increase(modelmesh_api_request_milliseconds_count{vModelId='example-onnx-mnist'}[1m]))` and verify that it returns the expected data

- Add `modelId` parameter to `logTimingMetricDuration` function in `Metrics.java`:
  - `modelmesh_cache_miss_milliseconds`
  - `modelmesh_loadmodel_milliseconds`
  - `modelmesh_unloadmodel_milliseconds`
  - `modelmesh_req_queue_delay_milliseconds`
  - `modelmesh_model_sizing_milliseconds`
  - `modelmesh_age_at_eviction_milliseconds`
- Add `modelId` parameter to `logSizeEventMetric` function in `Metrics.java`:
  - `modelmesh_loading_queue_delay_milliseconds`
  - `modelmesh_loaded_model_size_bytes`
- Add `modelId` and `vModelId` param to `logRequestMetrics` in `Metrics.java`:
  - `modelmesh_invoke_model_milliseconds`
  - `modelmesh_api_request_milliseconds`

Closes red-hat-data-services#60

Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Co-authored-by: Prashant Sharma <[email protected]>
Co-authored-by: Daniele Zonca <[email protected]>
Co-authored-by: Nick Hill <[email protected]>

VedantMahabaleshwarkar · 2023-10-11T14:59:18Z

/retest

spolti · 2023-10-11T16:54:08Z

src/test/java/com/ibm/watson/modelmesh/ModelMeshMetricsTest.java

    }

-    @Test
+    @BeforeAll


If it is intended to run before the suite, it would be good to rename the method to something more accurate, e.g. prepareMetricsEnv.

However, the method itself looks like a test scenario, not something to run beforehand.

spolti · 2023-10-11T16:59:35Z

src/main/java/com/ibm/watson/modelmesh/Metrics.java

+        public void logTimingMetricDuration(Metric metric, long elapsed, boolean isNano, String modelId) {
+            Histogram histogram = (Histogram) metricsMap.get(metric);
+            if (perModelMetricsEnabled && modelId != null) {
+                histogram.labels(modelId, "").observe(isNano ? elapsed / M : elapsed);


It would be good to add a description for this label.
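One way to document the empty second label is a named constant. The sketch below is illustrative only and does not use the Prometheus client: `TimingLabels`, `NO_VMODEL_ID`, and `NANOS_PER_MILLI` are hypothetical names. It assumes that the second label position holds the vModelId and that `M` in the diff is the number of nanoseconds per millisecond; neither is confirmed by the snippet above.

```java
// Hypothetical helper, not part of the PR: shows how a named constant can
// describe the blank label passed to histogram.labels(modelId, "").
final class TimingLabels {
    private TimingLabels() {}

    // Assumption: M in the diff means nanoseconds per millisecond.
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Assumption: the second label is the vModelId, which is not known
    // for timing metrics, so it is deliberately left blank.
    static final String NO_VMODEL_ID = "";

    // Mirrors the expression `isNano ? elapsed / M : elapsed` from the diff.
    // Integer division truncates any fractional milliseconds.
    static long toMillis(long elapsed, boolean isNano) {
        return isNano ? elapsed / NANOS_PER_MILLI : elapsed;
    }

    // Label values in the order (modelId, vModelId).
    static String[] labelsFor(String modelId) {
        return new String[] { modelId, NO_VMODEL_ID };
    }

    public static void main(String[] args) {
        String[] labels = labelsFor("example-onnx-mnist");
        System.out.println(labels[0] + " / [" + labels[1] + "]");
        System.out.println(toMillis(2_500_000L, true));
    }
}
```

With a constant like this, the call site would read `histogram.labels(modelId, NO_VMODEL_ID)`, which makes the intent of the blank value visible without a separate comment.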

spolti · 2023-10-11T17:03:43Z

src/main/java/com/ibm/watson/modelmesh/Metrics.java

-
+            String methodName = idx == -1 ? name : name.substring(idx + 1);
+            if (perModelMetricsEnabled) {
+                modelId = Strings.nullToEmpty(modelId);


My 2 cents:
Here I would use Optional or Objects.toString(env, $desiredValue).
That would avoid relying on an external library for the null check.
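As a minimal sketch of the two standard-library alternatives mentioned in this comment, the snippet below maps a null model ID to an empty string. The class and method names (`LabelDefaults`, `viaObjects`, `viaOptional`) are hypothetical and are not taken from the PR; only `java.util.Objects.toString(Object, String)` and `java.util.Optional` are real JDK APIs.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical helper, not part of the PR: JDK-only replacements for a
// nullToEmpty-style call.
final class LabelDefaults {
    private LabelDefaults() {}

    // Objects.toString(Object, String) returns the second argument
    // when the first argument is null.
    static String viaObjects(String modelId) {
        return Objects.toString(modelId, "");
    }

    // Optional.ofNullable(...).orElse(...) expresses the same fallback.
    static String viaOptional(String modelId) {
        return Optional.ofNullable(modelId).orElse("");
    }

    public static void main(String[] args) {
        System.out.println("[" + viaObjects(null) + "]");
        System.out.println("[" + viaOptional("example-onnx-mnist") + "]");
    }
}
```

Both variants behave the same for a `String` input; `Objects.toString` avoids allocating an `Optional` on each call, which may matter on a hot metrics path.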

spolti

added a few comments, not mandatory though.
Thanks.

Jooho · 2023-10-11T17:47:19Z

/retest

Jooho

/lgtm

openshift-ci · 2023-10-11T19:57:54Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jooho, VedantMahabaleshwarkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [Jooho,VedantMahabaleshwarkar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from anishasthana and israel-hdez October 11, 2023 13:39

openshift-ci bot added the approved label Oct 11, 2023

VedantMahabaleshwarkar mentioned this pull request Oct 11, 2023

Upgrade MM to v0.11.0 in RHODS + Metrics hotfix opendatahub-io/modelmesh-serving#228

Closed

8 tasks

spolti reviewed Oct 11, 2023

Jooho approved these changes Oct 11, 2023

openshift-ci bot assigned Jooho Oct 11, 2023

openshift-ci bot added the lgtm label Oct 11, 2023

openshift-ci bot merged commit 2190b2e into opendatahub-io:release-0.11 Oct 11, 2023
3 checks passed

heyselbi linked an issue Oct 12, 2023 that may be closed by this pull request

Upgrade MM to v0.11.0 in RHODS + Metrics hotfix opendatahub-io/modelmesh-serving#228

Closed

8 tasks
