Constraint for state = available AND attempts = max_attempts #1207

Open
whatyouhide opened this issue Dec 24, 2024 · 2 comments
Labels
area:oss Related to Oban OSS kind:enhancement New feature or request note:discussion Details or approval are up for discussion

Comments

@whatyouhide
Contributor

Hi @sorentwo 👋

Environment

  • Oban Versions
* oban 2.17.12 (Hex package) (mix)
  locked at 2.17.12 (oban) 7a647d6c
  ok
* oban_met 0.1.4 (Hex package) (mix)
  locked at 0.1.4 (oban/oban_met) b9579fdb
  ok
* oban_notifiers_phoenix 0.1.0 (Hex package) (mix)
  locked at 0.1.0 (oban_notifiers_phoenix) 7e820aea
  ok
* oban_pro 1.4.10 (Hex package) (mix)
  locked at 1.4.10 (oban/oban_pro) dfee951e
  ok
* oban_web 2.10.2 (Hex package) (mix)
  locked at 2.10.2 (oban/oban_web) e6a41b9c
  • PostgreSQL Version: AWS Aurora PostgreSQL, engine version 15.8

  • Elixir & Erlang/OTP Versions (elixir --version):

Erlang/OTP 25 [erts-13.2.2.6] [source] [64-bit] [smp:14:14] [ds:14:14:10] [async-threads:1] [dtrace]

Elixir 1.16.3 (compiled with Erlang/OTP 25)

What Happened

Last week, this happened in our Oban setup:

  • We ran a bad migration—a migration on our Aurora cluster that locked a big table for several minutes.
  • During that time, Oban jobs didn’t run because of Reasons™.
  • Once we aborted the migration, we went in to manually re-run failed Oban jobs. We should have used Oban.retry_all_jobs/1, but alas, middle-of-the-incident mentality! So, we manually updated the oban_jobs table and set the state of the jobs we wanted to retry to available.
  • These jobs were not picked up. I understand that that is due to state = available but attempts = max_attempts.
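
The last two bullets can be reproduced in miniature. What follows is only a sketch under stated assumptions, not Oban's code: Python's built-in sqlite3 module stands in for Postgres, a hypothetical four-column `jobs` table stands in for `oban_jobs`, the counter column is named `attempt`, and the `FETCH` query assumes that jobs are only selected while `attempt < max_attempts`, which is the understanding described above.

```python
import sqlite3

# Hypothetical, simplified stand-in for the oban_jobs table (not Oban's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    "id INTEGER PRIMARY KEY, state TEXT, attempt INTEGER, max_attempts INTEGER)"
)
conn.executemany(
    "INSERT INTO jobs (id, state, attempt, max_attempts) VALUES (?, ?, ?, ?)",
    [
        (1, "available", 0, 20),   # fresh job, never attempted
        (2, "discarded", 20, 20),  # job that used up its attempts during the incident
    ],
)

# Assumed shape of the fetch filter: available AND attempts remaining.
FETCH = (
    "SELECT id FROM jobs "
    "WHERE state = 'available' AND attempt < max_attempts ORDER BY id"
)


def fetchable_ids(connection):
    """Return the ids that the assumed fetch filter would select."""
    return [row[0] for row in connection.execute(FETCH)]


# The manual fix from the incident: flip the state and nothing else.
conn.execute("UPDATE jobs SET state = 'available' WHERE id = 2")
stranded = fetchable_ids(conn)  # job 2 is 'available' but still filtered out

# Giving the job one more attempt makes it selectable again.
conn.execute(
    "UPDATE jobs SET max_attempts = attempt + 1 "
    "WHERE id = 2 AND attempt >= max_attempts"
)
rescued = fetchable_ids(conn)

print(stranded, rescued)
```

In this toy model the state-only update leaves job 2 stranded, and it becomes selectable only once `max_attempts` is raised as well. Against a real database the equivalent change would have to be made to `oban_jobs` itself, or left to `Oban.retry_all_jobs/1`.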

My question is: should there be an invariant in the system (in the form of a constraint?), where Postgres forbids you to set state to available if attempts = max_attempts?

@sorentwo
Member

There used to be a constraint similar to that which caused problems when somehow the stager would try to mark jobs with attempt = max_attempts as available. When that happened it would crash the stager, which would then crash the Oban instance, and so on. That should never happen, and is probably not a concern now.

The attempt_range check could be modified to include the available state check. However, I wouldn't add a new migration (V13) for a circumstance that will only happen because of manual intervention.
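
For illustration, a combined check could look roughly like the sketch below. It rests on several assumptions: SQLite (through Python's sqlite3 module) stands in for Postgres, the `jobs` table is a hypothetical cut-down version of `oban_jobs`, and only the constraint name `attempt_range` is taken from this thread. The rule inside it is invented for the example and is not the definition used in Oban's migrations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        state TEXT NOT NULL,
        attempt INTEGER NOT NULL,
        max_attempts INTEGER NOT NULL,
        -- Hypothetical rule: an 'available' job must have attempts remaining.
        CONSTRAINT attempt_range
            CHECK (state <> 'available' OR attempt < max_attempts)
    )
    """
)
# A job that has exhausted all of its attempts.
conn.execute("INSERT INTO jobs VALUES (1, 'discarded', 20, 20)")


def try_update(sql):
    """Run an UPDATE and report whether the constraint allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        # The failed statement is undone; earlier rows are left untouched.
        return False


# State-only update, as done by hand during the incident: rejected.
state_only = try_update("UPDATE jobs SET state = 'available' WHERE id = 1")

# State and max_attempts changed together: accepted.
with_bump = try_update(
    "UPDATE jobs SET state = 'available', max_attempts = max_attempts + 1 "
    "WHERE id = 1"
)

final_row = conn.execute(
    "SELECT state, attempt, max_attempts FROM jobs WHERE id = 1"
).fetchone()
print(state_only, with_bump, final_row)
```

With a rule of this shape, the hand-written update fails loudly instead of leaving a job that is marked available but never runs. The trade-off is the one described above: any internal code path that writes the same combination would start raising errors too.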

@sorentwo added the kind:enhancement, note:discussion, and area:oss labels on Dec 24, 2024
@whatyouhide
Contributor Author

> However, I wouldn't add a new migration (V13) for a circumstance that will only happen because of manual intervention.

Agreed, this is more like something to fix for new users or bundled together with future migrations and whatnot, nothing urgent.

> That should never happen, and is probably not a concern now.

That's great to know 🙃
