ignore Read-only file system error on sysctl set #910

Open
wants to merge 1 commit into
base: main


haircommander

fixes #825

Needed for running podman in a Kubernetes pod. I am investigating whether we can avoid setting Proc to unmasked and still get away with enough podman operations.

Contributor

openshift-ci bot commented Jan 29, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: haircommander
Once this PR has been reviewed and has the lgtm label, please assign umohnani8 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mheon
Member

mheon commented Jan 29, 2024

This is probably going to break certain functionality (internal networks, definitely, probably also IPv6 will act strange since we can't configure neighbor advertisements properly).

@haircommander
Author

> This is probably going to break certain functionality (internal networks, definitely, probably also IPv6 will act strange since we can't configure neighbor advertisements properly).

Though in those cases, they were broken to begin with. This just makes a fatal error non-fatal. If netavark truly needed to set the sysctls, then something will break down the line, I think.
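
For illustration, the behavior proposed here (downgrading a read-only file system error from fatal to non-fatal) could look roughly like the sketch below. This is not netavark's actual code: the helper names `is_read_only_fs` and `write_sysctl_lenient` are hypothetical, the errno value 30 (EROFS) is hardcoded on the assumption of a Linux host, and printing a warning to stderr is an assumption as well.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Linux errno for "Read-only file system" (EROFS).
/// Hardcoded to avoid depending on the `libc` crate; assumes Linux.
const EROFS: i32 = 30;

/// Returns true if the I/O error carries the EROFS errno.
fn is_read_only_fs(err: &io::Error) -> bool {
    err.raw_os_error() == Some(EROFS)
}

/// Hypothetical helper: write `value` to the sysctl file at `path`.
/// Returns Ok(true) if the value was written, Ok(false) if the write was
/// skipped because the file system is read-only. Any other error is
/// still returned to the caller as fatal.
fn write_sysctl_lenient(path: &Path, value: &str) -> io::Result<bool> {
    match fs::write(path, value) {
        Ok(()) => Ok(true),
        Err(e) if is_read_only_fs(&e) => {
            // Assumption: a warning on stderr instead of a hard failure.
            eprintln!("warning: skipping sysctl {}: {}", path.display(), e);
            Ok(false)
        }
        Err(e) => Err(e),
    }
}
```

A caller would treat `Ok(false)` as "the sysctl was not applied" and continue, which is exactly why anything that truly depends on the sysctl may still fail later.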

@Luap99
Member

Luap99 commented Jan 30, 2024

I would at least have a warning printed (although I guess that may defeat the purpose if every single container start-up prints these).
The main issue is that some things simply don't work. If routing is not enabled, there will be no external network connectivity, and route_localnet is needed to allow port forwarding via 127.0.0.1. There may be some IPv6 sysctls that are not that important, but I would need to double-check which ones we really need.
Also, internal networks will not work at all, which will be a major issue as people may depend on the security guarantee that a container cannot connect to the internet, etc.

Ignoring these problems will turn into a nightmare of random support requests about why xyz isn't working in their env, and network issues are already a pain to deal with because they often lack a reproducer and/or involve race conditions. I don't want to deal with more of these.

I think we should still make the caller enforce that the sysctls are already set to the right values, i.e. it should be possible to set net.ipv4.conf.default.... for the outer container/pod so that the values are already correct when we create interfaces. In this case the current logic already works, as we first read the value and do not try to write if it is correct.
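
The read-before-write logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the helper name `ensure_sysctl` is hypothetical, and values are compared after trimming whitespace because files under /proc/sys typically end with a newline.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: make sure the sysctl file at `path` holds `wanted`.
/// Returns Ok(false) if the value was already correct (no write attempted)
/// and Ok(true) if a write was needed and succeeded.
fn ensure_sysctl(path: &Path, wanted: &str) -> io::Result<bool> {
    // Read first: on a read-only /proc/sys this still works.
    let current = fs::read_to_string(path)?;
    if current.trim() == wanted.trim() {
        // Already correct, so a read-only file system is never touched.
        return Ok(false);
    }
    // Only reached when the value differs; this is where a read-only
    // file system would produce an error.
    fs::write(path, wanted)?;
    Ok(true)
}
```

With this shape, a parent container that presets the right values never triggers the write path, so a read-only /proc/sys causes no error.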

@haircommander
Author

> I think we should still make the caller enforce that the sysctls are already set to the right values

So the code already does that. If the sysctl is set to the right value, it skips. The problem is when setting values like net.ipv4.eth0 or something, which does happen in the default flow. eth0 doesn't exist at container start, so we can't prepopulate the sysctls.

I am open to logging something if we skip, or passing an option to skip from podman. TBH I have hit other issues when testing with this patch so I'm not even sure if we should go down this route...

@Luap99
Member

Luap99 commented Feb 2, 2024

> the problem is when setting values like net.ipv4.eth0 or something, which does happen in the default flow. eth0 doesn't exist at container start, so we can't prepopulate the sysctls..

That is why I suggested the use of the special interface name default in the sysctl. All newly created interfaces will inherit the values from the default one and you can set the default sysctl to the correct values in the parent container because it will always exist.
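
To make the distinction concrete, here is a small sketch of where the per-interface and `default` entries live under /proc/sys. The helper name `ipv4_conf_path` is hypothetical, and `route_localnet` is used only as an example key taken from earlier in this thread.

```rust
use std::path::PathBuf;

/// Hypothetical helper: path of a per-interface IPv4 sysctl under /proc/sys.
/// Passing "default" as `iface` addresses the template entry that newly
/// created interfaces inherit from.
fn ipv4_conf_path(iface: &str, key: &str) -> PathBuf {
    PathBuf::from("/proc/sys/net/ipv4/conf")
        .join(iface)
        .join(key)
}
```

For example, `ipv4_conf_path("default", "route_localnet")` always points at an existing entry, whereas `ipv4_conf_path("eth0", "route_localnet")` only exists once eth0 has been created.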

Anyhow, I tend to agree this is all less than ideal and would not be future-proof, i.e. if we start setting a new sysctl then the parent container's config would need to be updated with it as well. It would be hard to maintain, as each update could result in breaking changes for users going that route.

Development

Successfully merging this pull request may close these issues.

RFC: consider read only sysctl errors as non fatal
3 participants