there is possible race condition with job start callback #93
This callback was added in nb2workflow by oda-hub/nb2workflow#135
It was also mentioned here: oda-hub/dispatcher-app#665 (comment)
I think I see why the race condition happens:
I also tested this behavior locally, and I could see that the callback is not started before the completion of the first This is my guess
What can we do about it? I'm thinking of a lock file (it won't help by itself, but it's still good to have) and also a condition in the job manager so that it can't "lower" the status.
How would you implement the lock? I guess it'd be on the file, to make sure that, in relation to what is described here, run_analysis completes first and only then run_call_back executes? So we'd have a consistent sequence of states?
And about this, when do you see this needed? In the case the first
I just tried again to track possible causes in the code (because I'm not 100% sure about it), but I get lost because I don't really understand the purpose and the logic of "job aliasing". Could someone explain?
Yes, this particular case. Probably, just one restriction that "progress" can't become "submitted" would be enough.
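The restriction discussed above (a status may never be "lowered") could be sketched roughly as follows. This is only an illustration: the `STATUS_RANK` ordering and the `next_status` helper are hypothetical names, not taken from the dispatcher code, and the real job manager may track more states.

```python
# Hypothetical ordering of job statuses, lowest to highest.
# The actual dispatcher may use other names or additional states.
STATUS_RANK = {"submitted": 0, "progress": 1, "done": 2}


def next_status(current, incoming):
    """Return the status to store, ignoring callbacks that would lower it."""
    if current is None:
        return incoming
    if STATUS_RANK[incoming] < STATUS_RANK[current]:
        # A late callback (e.g. "submitted" after "progress") is dropped.
        return current
    return incoming


# Callbacks arriving out of order: "submitted" after "progress",
# and a progress report after "done".
status = None
for callback in ["progress", "submitted", "done", "progress"]:
    status = next_status(status, callback)
print(status)
```

With such a guard, a "submitted" callback that arrives late cannot overwrite "progress", and a late progress report cannot overwrite "done".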
I just thought of using a library, like https://py-filelock.readthedocs.io/en/latest/index.html
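To illustrate what a lock file would buy here, below is a minimal stdlib-only sketch of the idea (it does not use the library linked above). The `lock_file` and `update_status` helpers and the `job_monitor.json` file name are assumptions made for the example, not the dispatcher's actual code. Note that the lock only serializes the read-modify-write of the file; it does not decide which callback should win.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def lock_file(path, timeout=5.0, poll=0.01):
    """Hold an exclusive lock by atomically creating `path`."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)


def update_status(job_file, status):
    """Read-modify-write the (hypothetical) job file while holding the lock."""
    with lock_file(job_file + ".lock"):
        state = {}
        if os.path.exists(job_file):
            with open(job_file) as f:
                state = json.load(f)
        state["status"] = status
        with open(job_file, "w") as f:
            json.dump(state, f)


workdir = tempfile.mkdtemp()
job_file = os.path.join(workdir, "job_monitor.json")
update_status(job_file, "submitted")
update_status(job_file, "progress")
```

A crashed process would leave a stale lock file behind in this sketch; a dedicated locking library typically handles such cases more robustly.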
Another possible race condition: the progress report callback may overwrite the "done" status. It's not fully confirmed, but it's possible, and I suspect it may be why the frontend gets stuck in "progress" until it re-requests with status "new". We observe this occasionally.
Yeah, that's what I see. It looks very weird: there is a product modal, but the only product that can be viewed is "progress". Eventually the actual product also appears.
OK, good to know. I will also watch for it and test locally.
oda-hub/dispatcher-app#670 is intended to fix this
Let's close this until it is confirmed to happen again.
I will reopen; at least I see this on staging. Example: PhotoZ instrument, Run_phosphoros_basic. It doesn't internally report progress from the notebook, and the job stays "submitted" rather than "progressing" until the notebook is completed (or failed).
since this callback is sent very quickly after query is sent to the backend
reported by @dsavchenko