Vendoring: timeout in root remote rawhide int tests #24463
Comments
https://cirrus-ci.com/task/4973455794241536 Using 80fc34e I got a server stack trace, which I think is more interesting than the client one we get by default. Goroutine 29 is the one that hangs. I see some other goroutines hanging on the container lock in container list, but that is expected: the stuck container is holding a container lock, so no listing will work.
Oh, and for those who don't know: rawhide is using composefs for testing, so my guess is that it has something to do with that.
Is it possible that we are stuck in kernel-land?
Yes, that is what the stack trace leads me to believe. mkdirat() should never take 3 minutes, I would say. And on that note, I regret not taking a quick look in the journal earlier, because this looks bad:
Sounds like we are hitting a bug in erofs in the kernel... I will check the other hang logs to see if we have the same trace there.
Confirmed: we are hitting a kernel bug. I see the same trace in all linked logs.
Good news, it passed in #24447: https://cirrus-ci.com/task/5246645728706560 Unfortunately, these images have other known issues, so we cannot just merge them. Hopefully tomorrow we can build new images that can actually pass CI, so we can put this behind us.
OK, it passed again in my PR: https://cirrus-ci.com/task/5752469513306112 I think we should be able to get new images into CI by next Tuesday, so I don't think we need to block the vendor dance on it.
#24447 passed CI and is merged, so this can be closed.
Three people are independently vendoring in c-xxx, and all three are seeing test timeouts. I am filing this as a central point for gathering logs and data.