Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server] Fix NPE when key value size profiling is not enabled #1378

Closed
wants to merge 1 commit into from

Conversation

sushantmane
Copy link
Collaborator

@sushantmane sushantmane commented Dec 10, 2024

Fix NPE when key value size profiling is not enabled

This commit addresses a NullPointerException in ServerHttpRequestStats when attempting to record key or value sizes while the corresponding profiling sensors (requestKeySizeSensor or requestValueSizeSensor) are null.

How was this PR tested?

UT

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@sushantmane sushantmane enabled auto-merge (squash) December 10, 2024 19:21
Copy link
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines -448 to +450
requestKeySizeSensor.record(keySize);
if (requestKeySizeSensor != null) {
requestKeySizeSensor.record(keySize);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... but it's weird though... why is this sensor null?

The server config should control both whether this sensor is null, and also whether this function gets called or not 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is:

This function is only called from MultiGetResponseStatsWithSizeProfiling (and the equivalent compute-related class). And MultiGetResponseStatsWithSizeProfiling can only get constructed when serverConfig.isKeyValueProfilingEnabled() is true...

And AggServerHttpRequestStats also gets passed the same config, which (after a few indirections) should determine whether that sensor gets created or not.

So how can these be mismatched?

This comment was marked as outdated.

Copy link
Collaborator Author

@sushantmane sushantmane Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's coming from the following stack trace.

java.lang.NullPointerException: Cannot invoke "io.tehuti.metrics.Sensor.record(double)" because "this.requestValueSizeSensor" is null
	at com.linkedin.venice.stats.ServerHttpRequestStats.recordValueSizeInByte(ServerHttpRequestStats.java:452) ~[venice-server.jar:?]
	at com.linkedin.venice.listener.response.stats.ResponseStatsUtil.consumeIntIfAbove(ResponseStatsUtil.java:22) ~[venice-server.jar:?]
	at com.linkedin.venice.listener.response.stats.SingleGetResponseStats.recordMetrics(SingleGetResponseStats.java:30) ~[venice-server.jar:?]
	at com.linkedin.venice.listener.ServerStatsContext.recordBasicMetrics(

In SingleGetResponseStats we've this method:

  @Override
  public void recordMetrics(ServerHttpRequestStats stats) {
    super.recordMetrics(stats);

    ResponseStatsUtil.consumeIntIfAbove(stats::recordValueSizeInByte, this.valueSize, 0);
    stats.recordKeySizeInByte(this.keySize);
  }

IMO in the above NPE case SingleGetResponseStats::valueSize is set from one of these places where we don't check size profiling config is enabled or not:

Copy link
Collaborator Author

@sushantmane sushantmane Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is regression due to: #1377. It assumed that single get request stats has similar logic to multi-key requests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that. Any idea why this wasn't caught in any of the tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, right. I see it now: #1377 (comment)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation to disable KV size profiling for single gets?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metric bloat maybe? IDK. @nisargthakkar ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea why this wasn't caught in any of the tests?

Probably we don't have E2E a test which verifies all metrics for single-get request type?

@sushantmane sushantmane disabled auto-merge December 10, 2024 19:38
@sushantmane sushantmane enabled auto-merge (squash) December 10, 2024 20:08
@sushantmane sushantmane disabled auto-merge December 10, 2024 20:56
@sushantmane
Copy link
Collaborator Author

Closing this PR since @nisargthakkar updating KV size profiling behavior in #1381

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants