Jenkins idler is retaining stale data on active builds #143
Comments
@ldimaggi So here is the thing: what happened with your builds in general? There is not even a completed build. Where are the march1test-1 and march1test-2 builds? They don't even seem to show up as completed builds. I think you reset the environment, right? Does this also reset pipeline builds? The logic in the Idler expects a specific flow that a build goes through and transitions the internal state accordingly. I am wondering whether resetting the environment basically screws up these state transitions, so that the state we keep in memory gets out of sync. @vpavlin what do you think? @ldimaggi @aslakknutsen I am not familiar with how resetting the environment works. What does it do in terms of OpenShift actions? What happens with existing builds? What type of events (if any) would be observable? |
It's my understanding that the env reset removes the build configs and deploy configs; I'm not sure what else it removes. Wouldn't deleting the bc's and dc's remove all running builds? How can I clean up this situation today? I cannot see anything in OSO via oc. |
Sure
I think the problem is really on the Idler side. Not sure whether there is much you can do from your end right now. Restarting the Idler might help. As part of this issue, I am planning to add some sort of reset call which would allow resetting the state for a single namespace. This way, if a namespace gets into an inconsistent state (in terms of the model the Idler builds of it), there is an easy way to reset just this namespace. But all this requires changes to the Idler code first. |
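To make the proposed per-namespace reset concrete, here is a minimal sketch in Go. It is not the Idler's actual code: `Store`, `BuildState`, `Reset`, and the namespace names `user-a` and `user-b` are hypothetical names chosen for illustration. It only shows the idea of dropping the in-memory state for one namespace while leaving every other namespace untouched.

```go
package main

import (
	"fmt"
	"sync"
)

// BuildState is a hypothetical, simplified view of what is remembered
// about the last build seen in a namespace.
type BuildState struct {
	Name  string
	Phase string
}

// Store is a hypothetical in-memory store keyed by namespace.
type Store struct {
	mu     sync.Mutex
	builds map[string]BuildState
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{builds: make(map[string]BuildState)}
}

// Set records the latest known build state for a namespace.
func (s *Store) Set(namespace string, b BuildState) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.builds[namespace] = b
}

// Get returns the remembered state for a namespace, if any.
func (s *Store) Get(namespace string) (BuildState, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	b, ok := s.builds[namespace]
	return b, ok
}

// Reset forgets everything known about one namespace and reports
// whether there was any state to remove. Other namespaces are untouched.
func (s *Store) Reset(namespace string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.builds[namespace]
	delete(s.builds, namespace)
	return ok
}

func main() {
	s := NewStore()
	s.Set("user-a", BuildState{Name: "march1test-1", Phase: "Running"})
	s.Set("user-b", BuildState{Name: "other-1", Phase: "Running"})

	// Simulate an operator resetting only the inconsistent namespace.
	removed := s.Reset("user-a")
	_, stillA := s.Get("user-a")
	_, stillB := s.Get("user-b")
	fmt.Println(removed, stillA, stillB)
}
```

After a reset, the next build event for that namespace would repopulate the entry, so the model is rebuilt from fresh data rather than patched.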
@hferentschik I am starting to look at this issue. After talking with @chmouel, it seems that the approach to fix this is to watch for the build delete event, and when it happens, remove the data from the idler so there is no more stale data in it. Do you think this is the right approach to fix this, or would you prefer to have a |
Yeah, that makes sense, looking at the That should prevent weird behaviour if you are able to recognise the delete event and react appropriately. I'd still consider a It might also be useful to check what happens with Proxy - imagine there is a webhook buffered and you reset the environment - I am not sure if it will simply stay there and retry forever, or if it disappears (sorry, long time no see with the code:) ). Hope this helps |
yeah, +1 on having a /reset; it may be a good idea anyway for ops and "Reset Env"! |
Ok, I will start with |
Now that I have implemented the The problem is that I am not really sure whether this can be detected using the event. Let me explain why: build events are fired for any change that occurs on that object. The build object is specified at https://docs.openshift.com/online/rest_api/apis-build.openshift.io/v1.Build.html#object-schema and there is one field called status, which is where you would expect to check whether the build has been deleted or not. Then there is one field that is called So my naive question is: is it enough to react to the canceled event? If the build is running and someone deletes it, then the build is canceled. If it is deleted when done, then we have already received the complete event, so we should not modify anything. WDYT? |
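The question above boils down to which build phases should count as "still active". A minimal Go sketch of that classification follows. The phase strings are the values I understand OpenShift to use for builds (note the spelling `Cancelled`); treat them as an assumption and check them against the Build schema linked above. `isActive` and `isTerminal` are hypothetical helper names, not Idler functions.

```go
package main

import "fmt"

// Build phase values, assumed to match the OpenShift build status phases.
const (
	PhaseNew       = "New"
	PhasePending   = "Pending"
	PhaseRunning   = "Running"
	PhaseComplete  = "Complete"
	PhaseFailed    = "Failed"
	PhaseError     = "Error"
	PhaseCancelled = "Cancelled"
)

// isTerminal reports whether a build in this phase will make no further progress.
func isTerminal(phase string) bool {
	switch phase {
	case PhaseComplete, PhaseFailed, PhaseError, PhaseCancelled:
		return true
	}
	return false
}

// isActive reports whether a build in this phase should block idling.
// Only the known in-progress phases count, so an unknown or empty
// phase never keeps a namespace awake.
func isActive(phase string) bool {
	switch phase {
	case PhaseNew, PhasePending, PhaseRunning:
		return true
	}
	return false
}

func main() {
	phases := []string{
		PhaseNew, PhasePending, PhaseRunning,
		PhaseComplete, PhaseFailed, PhaseError, PhaseCancelled,
	}
	for _, p := range phases {
		fmt.Printf("%-10s active=%v terminal=%v\n", p, isActive(p), isTerminal(p))
	}
}
```

Under this sketch, a running build that is deleted and therefore reported as cancelled stops counting as active, while a build deleted after completion was already terminal, which matches the reasoning in the comment above.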
Followup question - is the build automatically deleted when done? |
Nope, which is why we are able to see all pipeline runs in OSIO and OpenShift. |
If there is a build with phase |
Well, why not? It is just the phase of deleting the pipeline, but then the truth
is that we cannot listen to the delete event at all
On Tue, 24 Jul 2018 at 17:39, Kishan Sagathiya <[email protected]>
wrote:
… so there is no deleted phase
If there is a build with phase deleted, it isn't really deleted, is it?
|
I was able to reproduce this. |
So, all the build-related data is stored in userIdler, which is kept in memory. It takes some time for a recent change to be reflected in userIdler. So, immediately after resetting the environment, if you call the info API, it returns the old data that is stored in memory. Given some time, this should change to the current state. This issue is consistent and easily reproducible. After resetting the environment I saw old build data. I ran a new build and this is what I saw after that.
|
blocked on openshiftio/openshift.io#4356 |
Still blocked as prod-preview is down |
Not blocked anymore |
upstream issue filed for this openshift/origin#21112 |
Related to issue: openshiftio/openshift.io#2418
It appears that the issue reported in #2418 is caused by stale data relating to a completed build that is marked as active. The idler sees this build as active:
And - an active build prevents the idler from running.
But - no builds are active: