Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCO-52: add MCO API into openshift/api #1453

Merged
merged 10 commits into from
Sep 29, 2023

Conversation

jkyros
Copy link
Contributor

@jkyros jkyros commented May 4, 2023

This is an existing API, so fortunately or unfortunately we're probably stuck with some of the...tech debt here ☹️

We've tried this before, and while there wasn't really anything we were "hard blocked" on, it never quite got finished.

Iif you want to re-read previous attempts/reviews:

This time is going to be the one though, I can feel it! 😄

After this is done, we'll need to do client-go and pull it back into the MCO:

What I did:

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 4, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented May 4, 2023

@jkyros: This pull request references MCO-52 which is a valid jira issue.

In response to this:

This is an existing API, so fortunately or unfortunately we're probably stuck with some of the...tech debt here ☹️

We've tried this before, and while there wasn't really anything we were "hard blocked" on, it never quite got finished.

Iif you want to re-read previous attempts/reviews:

This time is going to be the one though, I can feel it! 😄

What I did:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 4, 2023

Hello @jkyros! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 4, 2023
@openshift-ci openshift-ci bot requested review from mfojtik and soltysh May 4, 2023 22:59
Comment on lines 1 to 11
package v1

import (
"fmt"

corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NewMachineConfigPoolCondition creates a new MachineConfigPool condition.
func NewMachineConfigPoolCondition(condType MachineConfigPoolConditionType, status corev1.ConditionStatus, reason, message string) *MachineConfigPoolCondition {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These helpers aren't used anywhere but in the MCO to my knowledge. I assume you'll want them excluded, but I figured there was no harm in asking? 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helpers have been removed.

Comment on lines 29 to 37
type ControllerConfig struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

// TODO(jkyros): inconsistent historical generation resulted in the controllerconfig CRD being
// generated with all fields required, while everything else was generated with optional

// +kubebuilder:validation:Required
// +required
Spec ControllerConfigSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that failed integration test made me poke through all the required fields.

The MCO hasn't been historically great at required/non-required fields since most of our CRD updates had been done manually. It looks like ControllerConfig was originally (years ago) generated with "required by default" at the package level and has been updated that way, while everything else was generated with "optional by default" at the package level and maintained that way.

This PR initially defaulted to "fields required by default" at the package level. Looking at the rest of the APIs in the repo, it looks like there's a mix?

I've adjusted this now so that:

  • // +kubebuilder:validation:Optional is set at the package level
  • the ControllerConfig fields that were previously required are set explicitly to required
  • the tests should be fixed and now supply the required fields (I will clean the data up before merge)

Generally though, what are your "best practice" thoughts on how required fields should be handled:

  • Required by default, mark everything we don't want required as optional or
  • Optional by default, mark everything we don't want optional as required
  • Mark literally every field explicitly so it doesn't matter? 😄

@deads2k
Copy link
Contributor

deads2k commented Jun 14, 2023

/hold

We definitely want this, but we have some other churn in progress that must complete first. Then we take this to reduce future debt in 4.14.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 14, 2023
@jkyros jkyros force-pushed the migrate-mco-api branch from 7aeab56 to 84bb609 Compare June 15, 2023 17:29
@jkyros
Copy link
Contributor Author

jkyros commented Jun 15, 2023

Rebased, set our runlevel to 80 to match existing MCO, adjusted approval annotation to match this PR, pulled out the helpers 😄

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jun 15, 2023

@jkyros: This pull request references MCO-52 which is a valid jira issue.

In response to this:

This is an existing API, so fortunately or unfortunately we're probably stuck with some of the...tech debt here ☹️

We've tried this before, and while there wasn't really anything we were "hard blocked" on, it never quite got finished.

Iif you want to re-read previous attempts/reviews:

This time is going to be the one though, I can feel it! 😄

After this is done, we'll need to do client-go and pull it back into the MCO:

What I did:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

// +required
Spec ControllerConfigSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"`
// +optional
Status ControllerConfigStatus `json:"status" protobuf:"bytes,3,opt,name=status"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MCO itself does not care about the protobufs or use them, so I'm cool with removing them.

Previous attempts at migration seemed to really care a lot about getting them in there but the reason was unclear. It could have been either cargo-cult or for something that has since gone away?

I just want to make sure we don't have some important thing somewhere that's needs protobufs that I don't know about.

Should I take them out?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protobufs have no effect for CRD types, so they should be removed, please make that its own commit

@jkyros
Copy link
Contributor Author

jkyros commented Aug 2, 2023

Rebased and trued up with recent MCO merges. I'll squash the extra commits out when we're cool with this.

operator/v1/register.go Show resolved Hide resolved
Comment on lines 116 to 119
required:
- lastTransitionTime
- status
- type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh right, so in the copy/paste some generation has made it required, and this reverts that? Would be good to squash that change somehow so it's obvious?

Perhaps if we do the copy/paste, then the next commit fixes all the optional vs required, then generate as the final commit, it will be more obvious that there's actually no diff?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I tried to squish the commits down to a more clear:

  • add existing -> modify annotations -> generate

hopefully that's...at least a little better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I was hoping for, let me re-review and i'll let you know if anything else can be done :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generation of updates is in the required/optional commit, can we move it into the final commit so that all generation happens in one go?

Not sure currently if these changes to drop the required from the schema are being reverted later, that's why I'm asking for all generation in one commit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so these required to optional transitions still exist even when the update comes at the end? Why are these fields switching from required to optional?

Copy link
Contributor Author

@jkyros jkyros Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember right (it's been 3 months or so) it was:

  • copy the MCO CRD in here
  • run the generator to make sure it worked -- it originally didn't because it was a mess -- but that first run regenerated a "wrong" CRD because a bunch of our fields were untagged and the generator here defaults to required rather than optional (and everything that wasn't controllerconfig in the MCO was generated with optional). So that first "sanity check" generation run did generate, but "erroneously" marked them required.
  • then I trued up the types.go and regenerated (in that later "annotate all fields properly" commit which you now see)

I can go back and correct the base CRD for this PR if you'd like, but I'm not concerned by the fields filpping here, they are flipping back to how they "should have been".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link to where this CRD currently gets installed into payload to show that these fields aren't correct there? If that's the case, at least we have a record of it then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see, you've done that already nvm

@@ -1,5 +1,6 @@
// +k8s:deepcopy-gen=package,register
// +groupName=machineconfiguration.openshift.io

// +kubebuilder:validation:Optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you chose to add this? Moving fields from being required to optional could be a breaking change so we need to be careful here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically..."mistakes were made" #1453 (comment). Half of our stuff was generated one way, half of it was generated the other. If you'd rather I "invert" it and go through and tag everything optional that was optional instead, I can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this repo, we ask that every field be marked as required or optional, so that we don't have the ambiguity of having to work out which is which. I intend to set a linter on this one day.
Could you make sure you have either +optional or +kubebuilder:validation:Required on all fields as appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yep, I have now marked all the fields as optional or required.

I did keep the +kubebuilder:validation:Optional for now because:

  • in the existing MCO we a field that is marked required in our CRD but is omitempty in types.go (maintaining CRDs by hand...) and that seemed to be the only way to keep parity.
  • I did generate both ways (with global optional in doc.go and without) and that field was the only difference

(I don't think preserving it matters that much, but I figure we can revisit it next time we make an MCO API change and/or do some "opinionated cleanup" on our API?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any field that isn't explicitly marked as optional or required, but happens to have omitempty, becomes optional by the logic of the generators.

Since the CRD had it as required, it should remain as required

machineconfiguration/v1/types.go Outdated Show resolved Hide resolved
@jkyros
Copy link
Contributor Author

jkyros commented Aug 9, 2023

Changes are:

  • rebased
  • squashed out some of the intermediate commits in an attempt to make things clearer
  • removed all usage of +required annotation in favor of +kubebuilder:validation:Required
  • grouped the machineconfiguration operator stuff together in the same commit
  • fixed approval annotations to match this PR
  • fixed test newlines
  • fixed the capitalization of omitEmpty in StaticPodOperatorStatus in operator types (but we can take it out if we don't want to fix it as part of this)

operator/v1/types_machineconfiguration.go Show resolved Hide resolved
Comment on lines 116 to 119
required:
- lastTransitionTime
- status
- type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generation of updates is in the required/optional commit, can we move it into the final commit so that all generation happens in one go?

Not sure currently if these changes to drop the required from the schema are being reverted later, that's why I'm asking for all generation in one commit

Comment on lines 203 to 209
// notBefore is the lower boundary for validity
// +kubebuilder:validation:Required
NotBefore string `json:"notBefore"`

// notAfter is the upper boundary for validity
// +kubebuilder:validation:Required
NotAfter string `json:"notAfter"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what release were these added? If they're in 4.14, we still have the option to change these. I expect that they should actually be *metav1.Time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that this matters, I added these and for sanity of output string just worked better. I can change if we need to

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this status or spec?

Typically timestamps in kube are all *metav1.Time since it has special marshal/unmarshal rules and a standard representation of the time, not sure it's hugely important but had I been doing an API review on this, I would have suggested to be consistent with the other APIs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the controllerconfig status. Want me to make a PR in the MCO to change these types? so we can then change it here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that would be preferable if these fields were introduced this release

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed these time fields to *metav1.Time, I left it as a separate commit but I can squash it out if need be

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like changing this is tricky for backwards compatibility? Since controllerconfig is an implementation detail today, I'm not sure we should block on it.

@jkyros
Copy link
Contributor Author

jkyros commented Aug 9, 2023

Moved all generation to one commit.

@jkyros
Copy link
Contributor Author

jkyros commented Aug 9, 2023

that was weird. gateway timeouts?
/retest

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2023
@jkyros
Copy link
Contributor Author

jkyros commented Sep 28, 2023

Alright, I've:

I notice I'm failing verify-crd-schema with even more errors ( ListsMustHaveSSATags ) now.

  • Do I need to default those lists to //+listType=atomic and we can come back later for "list merge efficiency gains"? Or do we want to merge this to unblock MCO stuff and then come back and address that all at once?

@JoelSpeed
Copy link
Contributor

Do I need to default those lists to //+listType=atomic and we can come back later for "list merge efficiency gains"? Or do we want to merge this to unblock MCO stuff and then come back and address that all at once?

Adding the atomic tag is good, as that will show that the field is treated as atomic. But, you can't change this now. The API is out and changing the behaviour of the merge strategy for the list could cause backwards compatibility issues for clients.

If you can add the tags for all the list type failures, we can ignore the other failures as these are "pre-existing unfixable" issues

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a few nits and a few questions about outstanding questions, but I think we are looking very close to go, is that correct?

labels:
"openshift.io/operator-managed": ""
annotations:
api-approved.openshift.io: https://github.com/openshift/api/pull/1453
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing release inclusion annotations, is that deliberate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that one wasn't previously in the install manifests, the MCO historically applied it when it came up.

I've added annotations to match the rest of the CRDs:

  include.release.openshift.io/ibm-cloud-managed: "true"
  include.release.openshift.io/self-managed-high-availability: "true"
  include.release.openshift.io/single-node-developer: "true"`

I didn't see any harm...well...unless the CVO has some sort of "these are annotated but I didn't apply them" cleanup/pruning function I don't know about.


// spec is the specification of the desired behavior of the Machine Config Operator
// +kubebuilder:validation:Required
// +required
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, don't need this tag, the kubebuilder version is canonical

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, thanks

Comment on lines +32 to +33
// TODO(jkyros): inconsistent historical generation resulted in the controllerconfig CRD being
// generated with all fields required, while everything else was generated with optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this todo resolved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, everything is explicitly tagged now, comment removed

const MachineConfigRoleLabelKey = "machineconfiguration.openshift.io/role"

// KubeletConfigRoleLabelPrefix is the label that must be present in the KubeletConfig CR
const KubeletConfigRoleLabelPrefix = "pools.operator.machineconfiguration.openshift.io/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be fixed pre-merge?

)

// Network contains network related configuration
type NetworkInfo struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be fixed pre-merge?


// ControllerConfigList is a list of ControllerConfig resources
//
// Compatibility level 4: No compatibility is provided, the API can change at any point for any reason. These capabilities should not be used by applications needing long term support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be level 1 right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I got the main type but I missed the List

jkyros and others added 10 commits September 28, 2023 15:20
This explicitly marks fields with optional/required in a way that
matches the original "requiredness" of the API in the MCO repo. Some of
these fields should probably be changed to required --
KubeletConfigCondition members for example should not be entirely
optional, but they always have been, and so for now we're going to leave
them set how they were. Later we can come back and fix them and see what
breaks.

This does set +kubebuilder:validation:Optional, but that's because we
historically have a field in ContinerRuntimeConfigSpec that was both
required in the CRD and tagged as omitempty in types.go and for now I'm
trying to preserve that until we can figure out if it's still necessary.
- Adds cert details from machine-config-operator#3756
- Adds buildcontroller state from machine-config-operator#3731
- Adds image registry bundles from machine-config-operator#3770
- Adds internal registry pull secret from machine-config-operator#3806
@jkyros
Copy link
Contributor Author

jkyros commented Sep 28, 2023

Alright so now I've:

  • added inclusion annotations to the ControllerConfig CRD
  • fixed the ControllerConfigList to be level 1
  • added //+listType=atomic tags to the lists that we own that verify was crying about (but upon further inspection, a lot of them are from embedded objects and other types here we don't own)
  • fixed the required tag on the operator field
  • removed my TODO about the tagging
  • regenerated clients

I'm thinking "not right now" on Trevor's naming changes -- not that I disagree with him on the rationale, I appreciate the input, but I'm unsure of the blast radius since those aren't new fields (they've been in there for years). If we're gonna improve the naming, I'd rather do it as part of some type of holistic approach (and if they're safe to change now, they should be just as safe to change later).

@JoelSpeed
Copy link
Contributor

I can't see any issues in net new fields, however, this type has a number of pre-existing failures causing the verify to fail. The verify compares new to old so with an override now should be ok in the future
/override ci/prow/verify-crd-schema
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 29, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 29, 2023

@jkyros: This pull request references MCO-52 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

This is an existing API, so fortunately or unfortunately we're probably stuck with some of the...tech debt here ☹️

We've tried this before, and while there wasn't really anything we were "hard blocked" on, it never quite got finished.

Iif you want to re-read previous attempts/reviews:

This time is going to be the one though, I can feel it! 😄

After this is done, we'll need to do client-go and pull it back into the MCO:

What I did:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 29, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jkyros, JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 29, 2023

@JoelSpeed: Overrode contexts on behalf of JoelSpeed: ci/prow/verify-crd-schema

In response to this:

I can't see any issues in net new fields, however, this type has a number of pre-existing failures causing the verify to fail. The verify compares new to old so with an override now should be ok in the future
/override ci/prow/verify-crd-schema
/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 29, 2023

@jkyros: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jkyros
Copy link
Contributor Author

jkyros commented Sep 29, 2023

@JoelSpeed are you cool if I release the hold? It's clean, I can safely pull it back into the MCO: openshift/machine-config-operator#3747

@JoelSpeed
Copy link
Contributor

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 29, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 29, 2023

@jkyros: This pull request references MCO-52 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

This is an existing API, so fortunately or unfortunately we're probably stuck with some of the...tech debt here ☹️

We've tried this before, and while there wasn't really anything we were "hard blocked" on, it never quite got finished.

Iif you want to re-read previous attempts/reviews:

This time is going to be the one though, I can feel it! 😄

After this is done, we'll need to do client-go and pull it back into the MCO:

What I did:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants