Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Api server thread clean shutdown #3348

Merged
merged 3 commits into from
Sep 20, 2023

Conversation

acatangiu
Copy link
Contributor

Changes

Use dedicated kill switch mechanism to break out of micro-http loop and gracefully shutdown the API server thread.

Remove hacky "/shutdown-internal" HTTP request which was externally exposing a way to bring down parts of firecracker components.

Dependencies

Ideally needs firecracker-microvm/micro-http#33 merged first, then point the dependency there.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license.

PR Checklist

  • All commits in this PR are signed (git commit -s).
  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • New unsafe code is documented.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.

  • This functionality can be added in rust-vmm.

@acatangiu acatangiu changed the title Api kill switch Api server thread clean shutdown Jan 3, 2023
@JonathanWoollett-Light JonathanWoollett-Light added the Status: Blocked Indicates that an issue or pull request cannot currently be worked on label Feb 28, 2023
@JonathanWoollett-Light
Copy link
Contributor

Thank you for the contribution, looks like a good change.

@JonathanWoollett-Light JonathanWoollett-Light added Type: Enhancement Indicates new feature requests Priority: Low Indicates that an issue or pull request should be resolved behind issues or pull requests labelled ` labels Mar 24, 2023
@xmarcalx
Copy link
Contributor

xmarcalx commented Aug 21, 2023

Hi @acatangiu ,

Thank you very much for your contribution and sorry for the late turnaround.
The PR needs to be rebased and @ShadowCurse will perform the review including firecracker-microvm/micro-http#33.

@ShadowCurse
Copy link
Contributor

Hi @acatangiu do you want to rebase this PR on top of main or do you want me to do that?

@acatangiu
Copy link
Contributor Author

Hi @acatangiu do you want to rebase this PR on top of main or do you want me to do that?

Yes, feel free to do it, thank you!

@ShadowCurse ShadowCurse force-pushed the api-kill-switch branch 3 times, most recently from 96d8b6c to a79d06e Compare August 25, 2023 15:57
@codecov
Copy link

codecov bot commented Aug 25, 2023

Codecov Report

Patch coverage: 20.00% and project coverage change: +0.02% 🎉

Comparison is base (af054d7) 82.26% compared to head (9e9df32) 82.28%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3348      +/-   ##
==========================================
+ Coverage   82.26%   82.28%   +0.02%     
==========================================
  Files         225      225              
  Lines       28466    28457       -9     
==========================================
- Hits        23417    23416       -1     
+ Misses       5049     5041       -8     
Flag Coverage Δ
4.14-c7g.metal 77.68% <20.00%> (+0.02%) ⬆️
4.14-m5d.metal 79.59% <20.00%> (+0.02%) ⬆️
4.14-m6a.metal 78.69% <20.00%> (+0.02%) ⬆️
4.14-m6g.metal 77.68% <20.00%> (+0.02%) ⬆️
4.14-m6i.metal 79.57% <20.00%> (+0.02%) ⬆️
5.10-c7g.metal 80.60% <20.00%> (+0.02%) ⬆️
5.10-m5d.metal 82.25% <20.00%> (+0.02%) ⬆️
5.10-m6a.metal 81.46% <20.00%> (+0.02%) ⬆️
5.10-m6g.metal 80.60% <20.00%> (+0.02%) ⬆️
5.10-m6i.metal 82.24% <20.00%> (+0.02%) ⬆️
6.1-c7g.metal 80.60% <20.00%> (+0.02%) ⬆️
6.1-m5d.metal 82.25% <20.00%> (+0.02%) ⬆️
6.1-m6a.metal 81.46% <20.00%> (+0.02%) ⬆️
6.1-m6g.metal 80.60% <20.00%> (+0.02%) ⬆️
6.1-m6i.metal 82.24% <20.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
src/api_server/src/parsed_request.rs 93.04% <ø> (-0.08%) ⬇️
src/firecracker/src/api_server_adapter.rs 0.00% <0.00%> (ø)
src/api_server/src/lib.rs 85.71% <100.00%> (+4.91%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ShadowCurse ShadowCurse force-pushed the api-kill-switch branch 4 times, most recently from 1ad413e to 30fbaf4 Compare August 25, 2023 17:18
@ShadowCurse ShadowCurse added Status: Awaiting review Indicates that a pull request is ready to be reviewed and removed Status: Blocked Indicates that an issue or pull request cannot currently be worked on labels Aug 25, 2023
@ShadowCurse ShadowCurse force-pushed the api-kill-switch branch 4 times, most recently from a98a0d0 to cc2dea9 Compare September 4, 2023 08:52
wearyzen
wearyzen previously approved these changes Sep 15, 2023
src/api_server/src/lib.rs Outdated Show resolved Hide resolved
Added shutdown event-fd acting as a kill switch.
Used to signal the api server thread that it needs to shut down.

Signed-off-by: Egor Lazarchuk <[email protected]>
Co-authored-by: acatangiu <[email protected]>
Copy link
Contributor

@JonathanWoollett-Light JonathanWoollett-Light left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShadowCurse It is not clear to me why we need try_connect, what is the problem with using UnixStream::connect where we used it originally?

@ShadowCurse
Copy link
Contributor

@JonathanWoollett-Light I did not change the PR. I just rebased it on top of main and fixed merge conflicts.
Considerring try_connect is only used in one test, I will remove it.

@bchalios
Copy link
Contributor

Codecov Report
Patch coverage: 20.00% and project coverage change: -4.58% ⚠️

Is there anything we can do to fix this?

Use dedicated kill switch mechanism to break out of micro-http
loop and gracefully shutdown the API server thread.

Remove hacky "/shutdown-internal" HTTP request which was externally
exposing a way to bring down parts of firecracker components.

Signed-off-by: Egor Lazarchuk <[email protected]>
Co-authored-by:: acatangiu <[email protected]>
@acatangiu
Copy link
Contributor Author

@ShadowCurse It is not clear to me why we need try_connect, what is the problem with using UnixStream::connect where we used it originally?

Iirc I added this “try_connect” just for tests to make sure tests don’t sporadically fail based on different thread scheduling on different machines. At some point I know for sure tests would sporadically fail when test thread tried to connect to api thread before api thread had the chance to create the socket.

Not sure if this is still a problem anymore, but I would still keep “try_connect” as defensive programming.

Copy link
Contributor

@JonathanWoollett-Light JonathanWoollett-Light left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ShadowCurse ShadowCurse merged commit 3156667 into firecracker-microvm:main Sep 20, 2023
3 of 5 checks passed
@acatangiu acatangiu deleted the api-kill-switch branch September 21, 2023 08:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Low Indicates that an issue or pull request should be resolved behind issues or pull requests labelled ` Status: Awaiting review Indicates that a pull request is ready to be reviewed Type: Enhancement Indicates new feature requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants