Implement new tasks, fix task bugs and improve testing (#584)
* Add flush and garbagecollect methods, sync and async

* Implement new tasks, flush and garbagecollect

* Modify Conditions to be metav1.Condition as it was close enough to JobCondition. Also, fix sync pod annotations to patch the correct instance. Modify the task Validator to mark the task as Failed instead of retrying until the task is fixed (and only logging to cass-operator logs why it is failing).

* Fix a flake in replacenode tests that was due to the speed of the requeue causing a matching timing with the pod delete (before recreate). Also, implement scrub and compaction calls.

* Add compaction and scrub tasks with tests. Improve testing in control tasks envtests by allowing to verify the payload sent to the management-api

* Add rebuild task to the decommission_dc e2e test

* Remove unintentional focused test

* Fix filename typo

* rebuild_task datacenter -> source_datacenter

* Add logging to show the jobId that we fetch

* Verify in the test_all_the_things that the cleanup task has completed

* Add rack filtering to all genericPodFilter jobs
burmanm authored Oct 20, 2023
1 parent 34e2ae6 commit bd3c39f
Showing 13 changed files with 657 additions and 131 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,11 @@ Changelog for Cass Operator, new PRs should update the `main / unreleased` secti
## unreleased

* [CHANGE] [#573](https://github.com/k8ssandra/cass-operator/issues/573) Add the namespace as env variable in the server-system-logger container to label metrics with.
+ * [ENHANCEMENT] [#580](https://github.com/k8ssandra/cass-operator/issues/580) Add garbageCollect CassandraTask that removes deleted data
+ * [ENHANCEMENT] [#578](https://github.com/k8ssandra/cass-operator/issues/578) Add flush CassandraTask that flushed memtables to the disk
+ * [ENHANCEMENT] [#586](https://github.com/k8ssandra/cass-operator/issues/578) Add scrub CassandraTask that allows rebuilding SSTables
+ * [ENHANCEMENT] [#582](https://github.com/k8ssandra/cass-operator/issues/582) Add compaction CassandraTask to force a compaction
+ * [BUGFIX] [#585](https://github.com/k8ssandra/cass-operator/issues/585) If task validation fails, stop processing the task and mark the validation error to Failed condition

## v1.17.2

46 changes: 19 additions & 27 deletions apis/control/v1alpha1/cassandratask_types.go
@@ -78,6 +78,8 @@ const (
CommandCompaction CassandraCommand = "compact"
CommandScrub CassandraCommand = "scrub"
CommandMove CassandraCommand = "move"
+ CommandGarbageCollect CassandraCommand = "garbagecollect"
+ CommandFlush CassandraCommand = "flush"
)

type CassandraJob struct {
@@ -91,10 +93,22 @@ type CassandraJob struct {
}

type JobArguments struct {
- KeyspaceName string `json:"keyspace_name,omitempty"`
- SourceDatacenter string `json:"source_datacenter,omitempty"`
- PodName string `json:"pod_name,omitempty"`
- RackName string `json:"rack,omitempty"`
+ KeyspaceName string `json:"keyspace_name,omitempty"`
+ SourceDatacenter string `json:"source_datacenter,omitempty"`
+ PodName string `json:"pod_name,omitempty"`
+ RackName string `json:"rack,omitempty"`
+ Tables []string `json:"tables,omitempty"`
+ JobsCount *int `json:"jobs,omitempty"`
+
+ // Scrub arguments
+ NoValidate bool `json:"no_validate,omitempty"`
+ NoSnapshot bool `json:"no_snapshot,omitempty"`
+ SkipCorrupted bool `json:"skip_corrupted,omitempty"`
+
+ // Compaction arguments
+ SplitOutput bool `json:"split_output,omitempty"`
+ StartToken string `json:"start_token,omitempty"`
+ EndToken string `json:"end_token,omitempty"`

// NewTokens is a map of pod names to their newly-assigned tokens. Required for the move
// command, ignored otherwise. Pods referenced in this map must exist; any existing pod not
@@ -104,9 +118,6 @@ type JobArguments struct {

// CassandraTaskStatus defines the observed state of CassandraJob
type CassandraTaskStatus struct {

- // TODO Status and Conditions is almost 1:1 to Kubernetes Job's definitions.

// The latest available observations of an object's current state. When a Job
// fails, one of the conditions will have type "Failed" and status true. When
// a Job is suspended, one of the conditions will have type "Suspended" and
@@ -118,7 +129,7 @@ type CassandraTaskStatus struct {
// +patchMergeKey=type
// +patchStrategy=merge
// +listType=atomic
- Conditions []JobCondition `json:"conditions,omitempty"`
+ Conditions []metav1.Condition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,1,rep,name=conditions"`

// Represents time when the job controller started processing a job. When a
// Job is created in the suspended state, this field is not set until the
@@ -158,25 +169,6 @@ const (
JobRunning JobConditionType = "Running"
)

- type JobCondition struct {
- // Type of job condition, Complete or Failed.
- Type JobConditionType `json:"type"`
- // Status of the condition, one of True, False, Unknown.
- Status corev1.ConditionStatus `json:"status"`
- // Last time the condition was checked.
- // +optional
- LastProbeTime metav1.Time `json:"lastProbeTime,omitempty"`
- // Last time the condition transit from one status to another.
- // +optional
- LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty"`
- // (brief) reason for the condition's last transition.
- // +optional
- Reason string `json:"reason,omitempty"`
- // Human readable message indicating details about last transition.
- // +optional
- Message string `json:"message,omitempty"`
- }

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

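To show how the new `JobArguments` fields serialize, here is a minimal sketch that copies a subset of the struct from the diff above into a standalone program and marshals a scrub-style argument set. Only the field names and JSON tags come from the commit; the `encodeArgs` helper and the sample values (`ks1`, `t1`, `t2`) are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobArguments is a local copy of a subset of the fields in this commit's
// JobArguments struct; it is reproduced here only so the example runs alone.
type JobArguments struct {
	KeyspaceName  string   `json:"keyspace_name,omitempty"`
	RackName      string   `json:"rack,omitempty"`
	Tables        []string `json:"tables,omitempty"`
	JobsCount     *int     `json:"jobs,omitempty"`
	NoValidate    bool     `json:"no_validate,omitempty"`
	NoSnapshot    bool     `json:"no_snapshot,omitempty"`
	SkipCorrupted bool     `json:"skip_corrupted,omitempty"`
}

// encodeArgs returns the JSON form of args (hypothetical helper).
func encodeArgs(args JobArguments) (string, error) {
	b, err := json.Marshal(args)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	jobs := 0 // an explicit zero, distinct from "not set"
	args := JobArguments{
		KeyspaceName: "ks1",
		Tables:       []string{"t1", "t2"},
		JobsCount:    &jobs,
		NoSnapshot:   true,
	}
	out, err := encodeArgs(args)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints {"keyspace_name":"ks1","tables":["t1","t2"],"jobs":0,"no_snapshot":true}
}
```

Because `JobsCount` is a `*int`, `omitempty` drops it only when the pointer is nil, so an explicit `0` still reaches the JSON. The `bool` fields are dropped whenever they are `false`. Whether that distinction is the reason the commit chose a pointer is an assumption.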
30 changes: 12 additions & 18 deletions apis/control/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

80 changes: 69 additions & 11 deletions config/crd/bases/control.k8ssandra.io_cassandratasks.yaml
@@ -108,6 +108,10 @@ spec:
args:
description: Arguments are additional parameters for the command
properties:
+ end_token:
+ type: string
+ jobs:
+ type: integer
keyspace_name:
type: string
new_tokens:
@@ -118,12 +122,28 @@ spec:
Pods referenced in this map must exist; any existing pod
not referenced in this map will not be moved.
type: object
+ no_snapshot:
+ type: boolean
+ no_validate:
+ description: Scrub arguments
+ type: boolean
pod_name:
type: string
rack:
type: string
+ skip_corrupted:
+ type: boolean
source_datacenter:
type: string
+ split_output:
+ description: Compaction arguments
+ type: boolean
+ start_token:
+ type: string
+ tables:
+ items:
+ type: string
+ type: array
type: object
command:
description: Command defines what is run against Cassandra pods
@@ -176,30 +196,68 @@ spec:
one of the conditions will have type "Complete" and status true.
More info: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/'
items:
+ description: "Condition contains details for one aspect of the current
+ state of this API Resource. --- This struct is intended for direct
+ use as an array at the field path .status.conditions. For example,
+ \n type FooStatus struct{ // Represents the observations of a
+ foo's current state. // Known .status.conditions.type are: \"Available\",
+ \"Progressing\", and \"Degraded\" // +patchMergeKey=type // +patchStrategy=merge
+ // +listType=map // +listMapKey=type Conditions []metav1.Condition
+ `json:\"conditions,omitempty\" patchStrategy:\"merge\" patchMergeKey:\"type\"
+ protobuf:\"bytes,1,rep,name=conditions\"` \n // other fields }"
properties:
- lastProbeTime:
- description: Last time the condition was checked.
- format: date-time
- type: string
lastTransitionTime:
- description: Last time the condition transit from one status
- to another.
+ description: lastTransitionTime is the last time the condition
+ transitioned from one status to another. This should be when
+ the underlying condition changed. If that is not known, then
+ using the time when the API field changed is acceptable.
format: date-time
type: string
message:
- description: Human readable message indicating details about
- last transition.
+ description: message is a human readable message indicating
+ details about the transition. This may be an empty string.
+ maxLength: 32768
type: string
+ observedGeneration:
+ description: observedGeneration represents the .metadata.generation
+ that the condition was set based upon. For instance, if .metadata.generation
+ is currently 12, but the .status.conditions[x].observedGeneration
+ is 9, the condition is out of date with respect to the current
+ state of the instance.
+ format: int64
+ minimum: 0
+ type: integer
reason:
- description: (brief) reason for the condition's last transition.
+ description: reason contains a programmatic identifier indicating
+ the reason for the condition's last transition. Producers
+ of specific condition types may define expected values and
+ meanings for this field, and whether the values are considered
+ a guaranteed API. The value should be a CamelCase string.
+ This field may not be empty.
+ maxLength: 1024
+ minLength: 1
+ pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
type: string
status:
- description: Status of the condition, one of True, False, Unknown.
+ description: status of the condition, one of True, False, Unknown.
+ enum:
+ - "True"
+ - "False"
+ - Unknown
type: string
type:
- description: Type of job condition, Complete or Failed.
+ description: type of condition in CamelCase or in foo.example.com/CamelCase.
+ --- Many .condition.type values are consistent across resources
+ like Available, but because arbitrary conditions can be useful
+ (see .node.status.conditions), the ability to deconflict is
+ important. The regex it matches is (dns1123SubdomainFmt/)?(qualifiedNameFmt)
+ maxLength: 316
+ pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
required:
+ - lastTransitionTime
+ - message
+ - reason
- status
- type
type: object
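The regenerated schema is stricter than the old `JobCondition`: `reason` and `message` were optional before and are now required, and `reason` must match a pattern. The sketch below checks a condition against a subset of those constraints before it would be written to the status. The `condition` type and `validateCondition` function are hypothetical; the pattern, enum values, and length limits are copied from the YAML above. The `type` pattern is not checked here, and counting `message` length in runes is an approximation of how the API server applies `maxLength`.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"unicode/utf8"
)

// reasonPattern is copied from the `reason` field of the CRD schema.
var reasonPattern = regexp.MustCompile(`^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$`)

// condition is a hypothetical stand-in holding the string fields of
// metav1.Condition that the schema constrains.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// validateCondition checks c against a subset of the schema constraints.
func validateCondition(c condition) error {
	switch c.Status {
	case "True", "False", "Unknown":
	default:
		return fmt.Errorf("status %q is not one of True, False, Unknown", c.Status)
	}
	if c.Type == "" {
		return errors.New("type is required")
	}
	if len(c.Type) > 316 {
		return errors.New("type exceeds maxLength 316")
	}
	if len(c.Reason) < 1 || len(c.Reason) > 1024 {
		return errors.New("reason must be 1 to 1024 characters long")
	}
	if !reasonPattern.MatchString(c.Reason) {
		return fmt.Errorf("reason %q does not match the schema pattern", c.Reason)
	}
	if utf8.RuneCountInString(c.Message) > 32768 {
		return errors.New("message exceeds maxLength 32768")
	}
	return nil
}

func main() {
	good := condition{Type: "Failed", Status: "True", Reason: "ValidationFailed", Message: "keyspace_name is required"}
	fmt.Println(validateCondition(good))

	bad := good
	bad.Reason = "validation failed" // contains a space, so the pattern rejects it
	fmt.Println(validateCondition(bad))
}
```

A practical consequence for code migrating from `JobCondition`: any place that previously left `Reason` empty now needs a non-empty CamelCase-style identifier, or the status update would be rejected by schema validation.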