Stuck after reboot #1589

Open
jknipper opened this issue Nov 26, 2024 · 3 comments
Labels
kind/bug Something isn't working platform/VMWare

Comments

@jknipper
jknipper commented Nov 26, 2024

Description

From time to time, servers get stuck in the boot process after a reboot. This happens only in rare cases, after an OS update has been applied.

Impact

The server is stuck after reboot and needs to be rebooted manually a second time to bring it up.

Environment and steps to reproduce

  1. The affected servers are virtual machines, with VMware as the hypervisor.
  2. The reboot is triggered automatically after an OS update.
  3. We have not been able to reproduce the behavior yet; it happens very rarely.

Expected behavior

The machine boots without interruption.

Additional information

From the attached screenshot of the server console it seems that the boot process got stuck while trying to mount sysroot.

Screenshot from 2024-11-26 12-22-17

This issue started some time ago and is hard for us to debug. Any suggestions on how we could investigate this further would be greatly appreciated!

@tormath1
Contributor

Hello @jknipper, thanks for raising this. Can you tell us since when (which release) you have been seeing this issue? It looks like an issue with the disk; do you have a specific configuration for your instances?

@jknipper
Author

From what I can see in our alerts and logging, it started occurring regularly in July or August. We are running the stable release and update all instances shortly after a new release is published. It's hard to tell whether this is on the hypervisor side or an issue with Flatcar. There was no real change in how the instances are provisioned; there may have been one on the VMware side, and I'll try to find out.

@tormath1
Contributor

Do all the instances reboot at the same time on the hypervisor?

Projects
Status: 📝 Needs Triage