neonvm: Add controller flag to disable runner cgroup #1034

Merged: sharnoff merged 3 commits into main from sharnoff/neonvm-disable-cpu-cgroup on Aug 29, 2024

sharnoff
Member

This commit adds new flags to neonvm-controller and neonvm-runner:

  • controller: -disable-runner-cgroup
  • runner: -enable-dummy-cpu-server

If the controller is passed -disable-runner-cgroup, then it will pass both -skip-cgroup-management and -enable-dummy-cpu-server to new runner pods.
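A minimal sketch of that mapping, for illustration only. The helper name `runnerCgroupArgs` is hypothetical and is not taken from the controller's code; only the flag names come from this description.

```go
package main

import "fmt"

// runnerCgroupArgs returns the extra flags the controller would append to a
// new runner pod's command line. Hypothetical helper: only the two flag names
// are taken from the PR description.
func runnerCgroupArgs(disableRunnerCgroup bool) []string {
	if !disableRunnerCgroup {
		// Default behavior: the runner manages its own cgroup, no extra flags.
		return nil
	}
	return []string{"-skip-cgroup-management", "-enable-dummy-cpu-server"}
}

func main() {
	fmt.Println(runnerCgroupArgs(true))
	fmt.Println(runnerCgroupArgs(false))
}
```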

If neonvm-runner is passed -enable-dummy-cpu-server (which requires the -skip-cgroup-management flag), it will not run QEMU in a cgroup, but will still provide an HTTP server whose endpoints respond as if the CPU limit had been updated successfully.
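The dependency between the two runner flags could be checked at startup roughly as follows. This is a sketch under assumptions: the `runnerConfig` type, the `parseRunnerFlags` function, and the choice to reject the invalid combination with an error are all illustrative, and the runner may enforce the requirement differently. The field and flag names match those in the diff.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// runnerConfig holds only the two flags discussed here (hypothetical type).
type runnerConfig struct {
	skipCgroupManagement bool
	enableDummyCPUServer bool
}

// parseRunnerFlags parses args and rejects -enable-dummy-cpu-server unless
// -skip-cgroup-management is also set. Rejecting with an error is an
// assumption about how the requirement might be enforced.
func parseRunnerFlags(args []string) (*runnerConfig, error) {
	cfg := &runnerConfig{}
	fs := flag.NewFlagSet("neonvm-runner", flag.ContinueOnError)
	fs.BoolVar(&cfg.skipCgroupManagement, "skip-cgroup-management", false,
		"Don't try to manage CPU")
	fs.BoolVar(&cfg.enableDummyCPUServer, "enable-dummy-cpu-server", false,
		"Serve CPU endpoints without touching cgroups")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if cfg.enableDummyCPUServer && !cfg.skipCgroupManagement {
		return nil, errors.New("-enable-dummy-cpu-server requires -skip-cgroup-management")
	}
	return cfg, nil
}

func main() {
	_, err := parseRunnerFlags([]string{"-enable-dummy-cpu-server"})
	fmt.Println("dummy server alone:", err)

	cfg, err := parseRunnerFlags([]string{"-skip-cgroup-management", "-enable-dummy-cpu-server"})
	fmt.Println("both flags:", cfg != nil, err)
}
```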

Internally, the runner's "dummy" implementation still needs to store the most recently set value to provide it back to the controller, so that it doesn't infinitely loop trying to set the CPU.
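To illustrate why the stored value matters, here is a rough sketch of a dummy server that records the last requested CPU value and reports it back. The endpoint paths, the JSON field name `milliCPU`, and the type names are assumptions for this example, not the runner's actual API. If the "current" endpoint always returned a fixed value instead, a controller that compares current against desired would keep retrying the update.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync/atomic"
)

// cpuBody is a hypothetical request/response payload.
type cpuBody struct {
	MilliCPU uint32 `json:"milliCPU"`
}

// dummyCPUServer never touches a cgroup; it only remembers the last value set.
type dummyCPUServer struct {
	last atomic.Uint32 // requires Go 1.19+
}

func (s *dummyCPUServer) handler() http.Handler {
	mux := http.NewServeMux()
	// Hypothetical path: accept the new limit and report success.
	mux.HandleFunc("/cpu_change", func(w http.ResponseWriter, r *http.Request) {
		var req cpuBody
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		s.last.Store(req.MilliCPU)
		w.WriteHeader(http.StatusOK)
	})
	// Hypothetical path: echo back the most recently stored limit.
	mux.HandleFunc("/cpu_current", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(cpuBody{MilliCPU: s.last.Load()})
	})
	return mux
}

// setThenGet drives the handler in-process: set a value, then read it back.
func setThenGet(h http.Handler, body string) string {
	post := httptest.NewRequest(http.MethodPost, "/cpu_change", strings.NewReader(body))
	h.ServeHTTP(httptest.NewRecorder(), post)

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/cpu_current", nil))
	return rec.Body.String()
}

func main() {
	s := &dummyCPUServer{}
	fmt.Print(setThenGet(s.handler(), `{"milliCPU":2500}`))
}
```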

ref https://neondb.slack.com/archives/C06SJG60FRB/p1723485730034399

@@ -620,7 +624,10 @@ func newConfig(logger *zap.Logger) *Config {
 		cfg.appendKernelCmdline, "Additional kernel command line arguments")
 	flag.BoolVar(&cfg.skipCgroupManagement, "skip-cgroup-management",
 		cfg.skipCgroupManagement,
-		"Don't try to manage CPU (use if running alongside container-mgr)")
+		"Don't try to manage CPU (use if running alongside container-mgr, or if dummy CPU server is enabled)")
+	flag.BoolVar(&cfg.enableDummyCPUServer, "enable-dummy-cpu-server",
Contributor

Do we actually plan to use runner without the enable-dummy-cpu-server?

I don't mind that it is there, but I think I'd run the server unconditionally, until we (likely) decide to abandon it overall.

Member Author

> Do we actually plan to use runner without the enable-dummy-cpu-server?

For now, yes. We haven't come to a decision beyond that.

sharnoff enabled auto-merge (squash) August 29, 2024 14:40
@sharnoff
Member Author

Fyi @Omrigan -- merging this, will probably conflict with #1054.

sharnoff merged commit ab457b7 into main Aug 29, 2024
18 checks passed
sharnoff deleted the sharnoff/neonvm-disable-cpu-cgroup branch August 29, 2024 14:54