Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure dns-default pods run on all nodes for 4.18+ clusters #2286

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mmazur
Copy link
Member

@mmazur mmazur commented Dec 3, 2024

What type of PR is this?

fix

What this PR does / why we need it?

DaemonSets iptables-alerter (-n openshift-network-operator) and dns-default (-n openshift-dns) should run across all nodes. But for OSD/ROSA Classic they don't due to us having infra nodes and the DSs' tolerations not accounting for that properly (see this thread).

Apparently this has not been an issue in practice thus far, but we probably want this to work as intended going forward, so I've enabled this for 4.18+ clusters only.

Note: this PR only addresses the dns-default pods, since it's not possible to effectively patch the iptables-alerter DS, as that's tightly controlled by its operator and getsquickly reverted.

Which Jira/Github issue(s) this PR fixes?

Fixes OSD-26887

Special notes for your reviewer:

Pre-checks (if applicable):

  • Tested latest changes against a cluster

  • Included documentation changes with PR

  • If this is a new object that is not intended for the FedRAMP environment (if unsure, please reach out to team FedRAMP), please exclude it with:

    matchExpressions:
    - key: api.openshift.com/fedramp
      operator: NotIn
      values: ["true"]

@openshift-ci openshift-ci bot requested review from bmeng and xiaoyu74 December 3, 2024 13:17
Copy link
Contributor

openshift-ci bot commented Dec 3, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mmazur
Once this PR has been reviewed and has the lgtm label, please assign apahim for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mmazur mmazur marked this pull request as draft December 3, 2024 13:17
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 3, 2024
@mmazur mmazur changed the title Make sure iptables-alerter and dns-default pods run on all nodes Make sure iptables-alerter pods run on all nodes Dec 3, 2024
@mmazur mmazur force-pushed the OSD-26887 branch 2 times, most recently from 8905290 to 96f68ca Compare December 3, 2024 14:04
@mmazur mmazur marked this pull request as ready for review December 3, 2024 14:04
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 3, 2024
@mmazur
Copy link
Member Author

mmazur commented Dec 3, 2024

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 3, 2024
@mmazur mmazur changed the title Make sure iptables-alerter pods run on all nodes Make sure dns-default pods run on all nodes Dec 6, 2024
@jewzaam jewzaam removed their request for review December 6, 2024 14:13
@mmazur
Copy link
Member Author

mmazur commented Dec 6, 2024

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 6, 2024
@mmazur mmazur changed the title Make sure dns-default pods run on all nodes Make sure dns-default pods run on all nodes for 4.18+ clusters Dec 6, 2024
Copy link
Contributor

openshift-ci bot commented Dec 6, 2024

@mmazur: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bmeng
Copy link
Contributor

bmeng commented Dec 9, 2024

I remember @bergmannf was working on the similar thing and turned out it is not working well.
#2144
#2158

@bergmannf
Copy link
Contributor

Thanks for the ping @bmeng - Yes there is an ADR about this, because there are strange edge-cases where we had issues with customer workloads that led to a big escalation.
In case we run this on all nodes it can lead to issues with high-churn clusters - and without the ADR we break the customer being able to decide where to run the pods.

@bergmannf
Copy link
Contributor

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants