
rsc: Consider job before blindly caching #1626

Merged: 6 commits from rsc-job-check-before-upload into master on Aug 8, 2024
Conversation

@V-FEXrt (Contributor) commented Aug 7, 2024

There are a few reasons to have a precheck before pushing a job to the cache:

  • A job might become cached while it is being run; uploading it then is a waste of resources (the original purpose of this PR)
  • A job might not be all that useful to cache, so it's easier to just skip it and rebuild later
  • The server may be overloaded and would prefer fewer requests; this gives it an opportunity to shed load

This PR implements those reasons as a route that the client should check before pushing a new job to the cache.
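For orientation, a rough sketch of what such a precheck decision could look like on the server side. All names, the status-code mapping, and the use of axum here are illustrative assumptions, not the actual rsc implementation (the real handlers live under rust/rsc/src/bin/rsc):

```rust
use axum::http::StatusCode;

// Possible outcomes of the precheck, mirroring the reasons above.
enum AllowJob {
    Allowed,         // nothing cached yet; go ahead and upload
    AlreadyCached,   // another run finished first, so the upload would be wasted
    NotWorthCaching, // e.g. the job is too cheap to be worth storing
    Shedding,        // the server is overloaded and is shedding requests
}

// One plausible mapping to HTTP statuses the client could branch on.
fn to_status(decision: AllowJob) -> StatusCode {
    match decision {
        AllowJob::Allowed => StatusCode::OK,
        AllowJob::AlreadyCached => StatusCode::CONFLICT,
        AllowJob::NotWorthCaching => StatusCode::NOT_ACCEPTABLE,
        AllowJob::Shedding => StatusCode::TOO_MANY_REQUESTS,
    }
}
```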

@colinschmidt (Contributor) left a comment

Most of the thinking is about when we want to see the different exits from allow_job in the server log.
We won't be able to record how many jobs were too short to be included or how many were duplicates without some sort of breadcrumb. However, for the too-short exit condition we would end up spamming the log, since all of those jobs will no longer be cached. The duplicate case seems potentially useful to log.
Finally, in the high-load case we are probably doing the right thing by not logging exactly when we shed a request, just that we are very likely to at the moment.
So I think the tracing summary is: I would want to see an info/warn trace on the CONFLICT status return.
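For illustration only, the kind of trace being asked for on the CONFLICT exit might look like the sketch below; the tracing crate is an assumption, and the function and field names are hypothetical:

```rust
use axum::http::StatusCode;

// Illustrative sketch of an info/warn trace on the CONFLICT status return.
fn trace_allow_job_exit(status: StatusCode, job_hash: &str) {
    if status == StatusCode::CONFLICT {
        tracing::warn!(job_hash, "allow_job: job already cached, rejecting duplicate upload");
    }
}
```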

Inline review threads:
rust/rsc/src/bin/rsc/read_job.rs (resolved)
rust/rsc/src/bin/rsc/read_job.rs (outdated, resolved)
rust/rsc/src/bin/rsc/types.rs (outdated, resolved)
@V-FEXrt (Contributor, Author) commented Aug 8, 2024

> We won't be able to record how many jobs were too short to be included or how many were duplicates without some sort of breadcrumb. However, for the too-short exit condition we would end up spamming the log, since all of those jobs will no longer be cached

We could maybe add a few more columns to the job history table. Then we'd have hits, misses, denied, conflicted, load_shed, or something along those lines. We already have the hash here, so it doesn't cost much (besides slightly more load and extra disk) to track.
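Roughly, the extended row could look like this; a sketch only, with illustrative names rather than the actual rsc schema:

```rust
// Hypothetical shape for the extended job history row.
struct JobHistory {
    hash: String,
    hits: i64,       // cache hits served for this hash
    misses: i64,     // lookups that found nothing
    denied: i64,     // precheck said the job wasn't worth caching
    conflicted: i64, // precheck found the job was already cached
    load_shed: i64,  // precheck refused because the server was shedding load
}
```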

> Finally, in the high-load case we are probably doing the right thing by not logging exactly when we shed a request, just that we are very likely to at the moment.

Yeah, I want to avoid too much extra work when under load. I'm actually somewhat tempted to temper the warning even further, since if we are under load that print is likely to be triggered a very large number of times. Tempted to do something like

if shed_chance > 0.5 && gen_bool(0.1) { LOG }

which would have a 10% chance of showing the "heavy load" log. Another improvement would be to add a dashboard graph for "chance to shed load".
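Filled out as runnable Rust, that sampling could look like the sketch below; it assumes the rand and tracing crates, and the function and field names are illustrative:

```rust
use rand::Rng;

// Sampled "heavy load" warning: only consider logging once the shed
// probability is already high, and even then emit the line only ~10% of the
// time so a loaded server isn't also busy writing log lines.
fn maybe_log_load_shed(shed_chance: f64) {
    if shed_chance > 0.5 && rand::thread_rng().gen_bool(0.1) {
        tracing::warn!(shed_chance, "under heavy load, shedding cache upload requests");
    }
}
```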

> So I think the tracing summary is: I would want to see an info/warn trace on the CONFLICT status return.

On it

@colinschmidt (Contributor) commented

DB tracking for these exit conditions sounds great and shouldn't be too expensive 🤞
Yeah only showing the load shedding message sometimes sounds good to me.

@colinschmidt (Contributor) left a comment

LGTM

@V-FEXrt merged commit 177614b into master on Aug 8, 2024
10 checks passed
@V-FEXrt deleted the rsc-job-check-before-upload branch on August 8, 2024 at 22:17