
Fix QEMU hanging silently on failed boot #734

Open
wants to merge 2 commits into
base: main


olethanh
Collaborator

Ticket: JIRA-344

Problem:
When QEMU failed to boot the hard drive image file provided by the user, the QEMU process, and hence the controller, would hang indefinitely
without showing an error message. For example, we have had cases of users supplying an ext4 image meant for Firecracker
instead of a QEMU disk image (this was facilitated by an oversight in the TypeScript SDK).

Analysis

  1. The boot process was not part of the logs or the process output (even inside the server), which is part of what made this hard to debug.
  2. QEMU tries to boot via the network even though that is useless here.
  3. After all boot methods fail, the QEMU process, and thus the controller, keeps running indefinitely.

Solution:
Change the options passed to QEMU:
-nographic makes it output the boot process on standard output (and thus in the logs)
-boot order=c only boots from the first hard drive (not sure if this actually works)
-boot reboot-timeout=1 makes it reboot if it fails to boot, but since we have -no-reboot the process just stops (default is -1, no reboot)
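
For illustration only, here is a minimal sketch of how the options listed above could be assembled into a QEMU command line. It is not the code changed by this PR: the function name `build_qemu_args`, the `qemu-system-x86_64` binary name, the `-drive` argument, and the image path are all assumptions made for the example. The two `-boot` settings are merged into a single `-boot order=c,reboot-timeout=1` argument, which QEMU's `-boot` syntax allows.

```python
import shlex
from typing import List


def build_qemu_args(image_path: str, qemu_bin: str = "qemu-system-x86_64") -> List[str]:
    """Build a QEMU argument list with the boot options from this PR.

    Hypothetical helper: the name, the default binary and the -drive
    argument are assumptions, not taken from the project's code.
    """
    return [
        qemu_bin,
        # Disk image supplied by the user (assumed to be attached as a plain disk).
        "-drive", f"file={image_path},media=disk",
        # Send the boot process to standard output instead of a graphical window.
        "-nographic",
        # Boot only from the first hard drive, and ask for a reboot shortly
        # after a failed boot (reboot-timeout is given in milliseconds).
        "-boot", "order=c,reboot-timeout=1",
        # Exit instead of rebooting, so a failed boot ends the process.
        "-no-reboot",
    ]


if __name__ == "__main__":
    # Print the command rather than running it; the path is a placeholder.
    print(shlex.join(build_qemu_args("/path/to/disk.img")))
```

The resulting list could be passed to `subprocess.run` or an equivalent launcher. With `-no-reboot` in place, the reboot requested after a failed boot makes the process exit, so the caller sees it terminate instead of hanging.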

olethanh requested a review from nesitor on December 19, 2024, 09:02

codecov bot commented Dec 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.38%. Comparing base (2f93e70) to head (b43300b).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #734   +/-   ##
=======================================
  Coverage   62.38%   62.38%           
=======================================
  Files          70       70           
  Lines        6235     6235           
  Branches      507      507           
=======================================
  Hits         3890     3890           
  Misses       2187     2187           
  Partials      158      158           

