CRAYSAT-1878: Remove automatic cronjob recreation from bootsys
#244
cbb6bdc to 116f2b8
Testing on rocket has been completed. The step that un-suspends the hms-discovery cronjob and waits for a job to be scheduled now completes very quickly in my testing thanks to the minor tweaks made here. Before executing the
Executing the command:
Looking at the cronjob and jobs afterwards:
Looks good to me.
looks good
116f2b8 to 1123c52
Remove the step that automatically checks for and re-creates stuck Kubernetes CronJobs from the `platform-services` stage of `sat bootsys boot`. This should not be necessary anymore starting in Kubernetes 1.21, which made a new CronJobControllerV2 the default.

In addition, improve the logic of the `HMSDiscoveryScheduledWaiter`, so that it will more reliably detect when an `hms-discovery` Job has been scheduled for the CronJob. Pass in an explicit `start_time`, so that we can look for any jobs created for the CronJob after it is re-enabled. This ensures we won't miss the first one, which could be scheduled between when we set `suspend=False` on the CronJob and when we create the `HMSDiscoveryScheduledWaiter`.

Test Description:
Tested on rocket as follows:
* Suspended the `hms-discovery` CronJob
* Ran `sat bootsys boot --stage cabinet-power`
* Verified that it correctly identified when the CronJob was scheduled
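The `start_time` reasoning above can be illustrated with a minimal, self-contained sketch. It is not the actual `HMSDiscoveryScheduledWaiter` code: `JobInfo` and `job_scheduled_since` are hypothetical names, and the Job metadata is modeled with plain Python objects rather than read from the Kubernetes API. It only shows why the timestamp should be captured before the CronJob is re-enabled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class JobInfo:
    """Hypothetical stand-in for the Job metadata a waiter would inspect."""
    name: str
    owner_cronjob: str       # name of the CronJob that created the Job
    creation_time: datetime  # when the Job was created


def job_scheduled_since(jobs, cronjob_name, start_time):
    """Return True if any Job owned by cronjob_name was created at or after start_time."""
    return any(
        job.owner_cronjob == cronjob_name and job.creation_time >= start_time
        for job in jobs
    )


# Assumed timeline: start_time is recorded just before setting suspend=False.
start_time = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
jobs = [
    # Created before the CronJob was re-enabled: must be ignored.
    JobInfo("hms-discovery-old", "hms-discovery", start_time - timedelta(minutes=30)),
    # Created shortly after re-enabling: this is the Job the waiter looks for.
    JobInfo("hms-discovery-new", "hms-discovery", start_time + timedelta(seconds=5)),
    # Owned by a different CronJob: must be ignored.
    JobInfo("other-job", "some-other-cronjob", start_time + timedelta(seconds=5)),
]

# With the explicit, early start_time the first new Job is detected.
found_with_explicit_start = job_scheduled_since(jobs, "hms-discovery", start_time)

# If the reference time were only taken when the waiter is created
# (here, 10 seconds later), the same Job would be missed.
late_start = start_time + timedelta(seconds=10)
found_with_late_start = job_scheduled_since(jobs, "hms-discovery", late_start)
```

In this sketch the Job created five seconds after re-enabling is found only when the comparison uses the earlier, explicitly passed `start_time`, which is the race the commit message describes.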
PyCharm was complaining about this. Make it happy.
1123c52 to 97368b2
Summary and Scope
Remove the step that automatically checks for and re-creates stuck Kubernetes CronJobs from the `platform-services` stage of `sat bootsys boot`. This should not be necessary anymore starting in Kubernetes 1.21, which made a new CronJobControllerV2 the default.

In addition, improve the logic of the `HMSDiscoveryScheduledWaiter`, so that it will more reliably detect when an `hms-discovery` Job has been scheduled for the CronJob. Pass in an explicit `start_time`, so that we can look for any jobs created for the CronJob after it is re-enabled. This ensures we won't miss the first one, which could be scheduled between when we set `suspend=False` on the CronJob and when we create the `HMSDiscoveryScheduledWaiter`.

Issues and Related PRs
Testing
Tested on:
Test description:
Tested on rocket as follows:
* Suspended the `hms-discovery` CronJob
* Ran `sat bootsys boot --stage cabinet-power`
* Verified that it correctly identified when the CronJob was scheduled
Risks and Mitigations
Should be pretty low-risk. This removes functionality that has caused more problems than it solved. It can always be executed manually as documented, if needed.
Pull Request Checklist