Update to include compatibility with RHEL9 #2

Open
snowbird294 opened this issue Feb 8, 2023 · 11 comments

Comments

@snowbird294

RHEL 9 was released after the last published update of arbiter2. A number of cgroups changes were implemented in RHEL 9. Are there plans to update arbiter to match?

The first error I'm running into is "ERROR: processes.memsw = TRUE" isn't available, but digging down, there's another error about "cgroup hierarchy doesn't exist." I can see at least part of the cgroups hierarchy under /sys/fs/cgroup, including memory listed inside cgroup.controllers. I believe this indicates that cgroups has changed, and that this is the compatibility issue with Arbiter.
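For anyone hitting the same errors, a quick way to confirm which cgroup layout a node is running is to look for the cgroup.controllers file at the mount root, which is present in the v2 unified hierarchy. The sketch below is only an illustration: the function name `detect_cgroup_version` is made up for this example and is not part of Arbiter, and the v1 check (looking for a memory/ controller directory) is a simplifying assumption.

```python
from pathlib import Path


def detect_cgroup_version(root: Path = Path("/sys/fs/cgroup")) -> str:
    """Guess which cgroup layout is mounted at `root` (hypothetical helper)."""
    # The v2 unified hierarchy exposes cgroup.controllers at the mount root.
    if (root / "cgroup.controllers").is_file():
        return "v2"
    # v1 mounts each controller as its own directory, for example memory/.
    # A hybrid mount also looks like this, so it is reported as v1 here.
    if (root / "memory").is_dir():
        return "v1"
    return "unknown"


if __name__ == "__main__":
    print(detect_cgroup_version())
```

If this reports v2 on a RHEL 9 node, that would be consistent with the "cgroup hierarchy doesn't exist" error described above.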

@jay-mckay
Collaborator

RHEL 9 does in fact mount cgroups v2 by default. Arbiter2 relies on the cgroups v1 hierarchy and currently does not work with v2. Work is being done to support both versions, but it may take some time. In the meantime, if your situation allows, you can mount v1 by default, and arbiter should work like normal (although my testing has been very limited). You can do this with the kernel boot parameter: "systemd.unified_cgroup_hierarchy=0".
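After rebooting, it can help to confirm that the parameter actually reached the running kernel. A minimal sketch, assuming a Linux node where /proc/cmdline is readable; the helper name `requests_cgroup_v1` is hypothetical and only the exact parameter string quoted above is checked:

```python
from pathlib import Path

# Parameter string taken from the comment above.
V1_PARAM = "systemd.unified_cgroup_hierarchy=0"


def requests_cgroup_v1(cmdline: str) -> bool:
    """Return True if the kernel command line carries the v1 parameter."""
    # Parameters on the kernel command line are whitespace separated.
    return V1_PARAM in cmdline.split()


if __name__ == "__main__":
    cmdline = Path("/proc/cmdline").read_text()
    print(requests_cgroup_v1(cmdline))
```

This only shows that the parameter was passed at boot. Whether the v1 hierarchy is really mounted still needs to be checked under /sys/fs/cgroup.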

@singhsaluja

Thanks for your excellent work! Arbiter2 on our HPC login nodes has been a real lifesaver. We're in the process of upgrading to RHEL 9.2. Do you know when Arbiter will support RHEL 9? Thanks!

@snowbird294
Author

@jay-mckay is there an update for supporting cgroups v2?

@snowbird294
Author

@jay-mckay we're having issues with arbiter2 not working even with the grub/kernel parameters set. Is there an update for the conversion to RHEL9?

@singhsaluja

@snowbird294 We switched our login nodes to cgroups v1 to make arbiter work, while our compute nodes stay on the default cgroups v2. Mentioning it in case you're still keen on getting Arbiter set up on RHEL9. It really comes in handy!

@snowbird294
Author

Found my bug.

If you log in as root, run su - $USER, and then start a misbehaving process, arbiter doesn't see it, because the session started as a safe user (root). user-$USER.slice was never created, so arbiter never saw the process.
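One way to check for this situation is to read /proc/self/cgroup from the test shell and see which user slice the process landed in. The sketch below is illustrative only: the helper name `slice_uid` is hypothetical, and it assumes systemd's naming of per-user slices by numeric UID (for example user-1000.slice) under user.slice.

```python
import os
import re
from pathlib import Path
from typing import Optional

# Assumes systemd's user-<UID>.slice naming under user.slice.
_SLICE_RE = re.compile(r"/user\.slice/user-(\d+)\.slice(?:/|$)")


def slice_uid(proc_cgroup_text: str) -> Optional[int]:
    """Return the UID of the user slice named in /proc/<pid>/cgroup text."""
    for line in proc_cgroup_text.splitlines():
        # Each line has the form hierarchy-ID:controller-list:cgroup-path.
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = _SLICE_RE.search(parts[2])
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    found = slice_uid(Path("/proc/self/cgroup").read_text())
    print(f"process uid={os.getuid()} slice uid={found}")
```

If the slice UID differs from the process UID, or no user slice is found at all, the process is probably not accounted under the user's own slice, which would match the behaviour described above.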

Would still like an update on the state of v2 integration.

@jay-mckay
Collaborator

@snowbird294 Hello, sorry to keep you waiting.

An update regarding v2:

Because cgroups v2 changed the filesystem hierarchy, a lot of the cgroup manipulation in arbiter would have to be completely rewritten. For this reason and a few others (ease of install/deployment, maintainability of the code base), we have decided to create a new version of arbiter, which includes a major architectural redesign. This new version will support both versions of cgroups out of the box without any additional configuration.

In conclusion, this means we are not going to support v2 in Arbiter2. We are in the final stages of development now, and will be at PEARC this year to showcase the work we have done. We will announce the release of this in the coming months.

We will continue to support Arbiter2, but we will probably not be adding support for v2.

@ellestad

ellestad commented Sep 3, 2024

I see PEARC has come and gone. Any update on the time frame for the next version of arbiter2 with cgroups v2 support?

@jay-mckay
Collaborator

The new version (which we are calling Arbiter 3) is in the final testing stages, and should be released before the end of the year, probably sooner rather than later.

@dch1fc

dch1fc commented Oct 30, 2024

Hi Jay,

any update on the time frame of the release of Arbiter3?

@jay-mckay
Collaborator

@dch1fc Arbiter3 is still in testing, but most of the features are complete. Any interested parties can play around with the code here, but we are still finding bugs and fixing them as we test. We are looking to release the first stable version at the beginning of December of this year.
