
Remove the medusa standalone pod #1304

Merged: 2 commits into main, May 6, 2024

Conversation

adejanovski (Contributor) commented:

What this PR does:
Removes the medusa standalone pod, which serves no real purpose today and can be replaced by the medusa containers running in the Cassandra pods.

Which issue(s) this PR fixes:
Fixes #1066

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CHANGELOG.md updated (not required for documentation PRs)
  • CLA Signed: DataStax CLA

@adejanovski marked this pull request as ready for review May 3, 2024 06:29
@adejanovski requested a review from a team as a code owner May 3, 2024 06:29
@Miles-Garnsey (Member) left a comment:

I've left a few comments where I'm unclear how the logic is supposed to work. I think there are a few problems in there, but I'd like to be mistaken.

More generally, is there a test for this functionality? I can see a lot of stuff removed from the tests, but a straightforward (ideally e2e) test showing that we can indeed retrieve a restore mapping would be nice to see.

controllers/medusa/controllers_test.go (thread resolved)
if medusaClient, err := r.ClientFactory.NewClient(ctx, addr); err != nil {
request.Log.Error(err, "Failed to create Medusa client", "address", addr)
} else {
if err != nil {
Miles-Garnsey (Member) commented:

Issue: This logic doesn't appear to make very much sense. You're checking for an error, then logging the error, then checking for the error again and returning it?? What am I missing here?

Miles-Garnsey (Member) commented:

Thought: this all really needs a refactor too, e.g. why is this logger sitting on the request struct not the reconciler struct?

adejanovski (Contributor, Author) replied:

Changed the logger and the method signature 👍

return nil, err
}
for _, pod := range pods {
addr := net.JoinHostPort(pod.Status.PodIP, fmt.Sprint(shared.BackupSidecarPort))
Miles-Garnsey (Member) commented:

Issue: I'm not sure how this is supposed to work. On the one hand, the spec in the ticket calls for logic which sequentially tries different pods until it finds one whose Medusa container it can connect to.

But that isn't what is happening here. If I'm not mistaken, you're going to error out if you can't connect to the first pod you try. Moreover, if the ordering of the pods is deterministic, then every time you hit this loop you'll obtain the same error, and you'll never progress to trying other sidecar containers in other pods which may be available.

If you have to do it this way (rather than having some real retry logic) at least consider randomising the order of the pods in this slice so that you're trying more than just the first one.
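A minimal, self-contained sketch of the pattern being suggested — try each pod in turn, log and continue on failure, and only error out once every pod has been tried. Names like `firstReachableAddr`, the `pod` struct, and the injected dial function are hypothetical stand-ins, not code from this PR:

```go
package main

import (
	"errors"
	"fmt"
)

// errDown simulates an unreachable Medusa sidecar.
var errDown = errors.New("connection refused")

// pod is a stand-in for the corev1.Pod objects the controller iterates over.
type pod struct{ ip string }

// firstReachableAddr tries each pod's address in order and returns the first
// one the dial function accepts, instead of failing on the first error.
func firstReachableAddr(pods []pod, dial func(addr string) error) (string, error) {
	for _, p := range pods {
		addr := p.ip + ":50051" // placeholder for shared.BackupSidecarPort
		if err := dial(addr); err != nil {
			fmt.Printf("failed to connect to %s, trying next pod: %v\n", addr, err)
			continue // move on to the next sidecar instead of returning
		}
		return addr, nil
	}
	return "", errors.New("no reachable Medusa sidecar")
}

func main() {
	pods := []pod{{"10.0.0.1"}, {"10.0.0.2"}}
	// Fake dialer: the first pod is down, the second answers.
	dial := func(addr string) error {
		if addr == "10.0.0.1:50051" {
			return errDown
		}
		return nil
	}
	addr, err := firstReachableAddr(pods, dial)
	fmt.Println(addr, err)
}
```

The reviewer's fallback suggestion, shuffling the slice before the loop, would be a one-liner with `rand.Shuffle`, but the continue-on-error loop above is closer to the sequential behaviour the ticket describes.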

adejanovski (Contributor, Author) replied:

The first check of err != nil in the else clause needs to be removed for this to work.
Not sure how it ended up there, good catch 👍

Miles-Garnsey (Member) replied:

Sounds about right. Suggestion: I think you should have a test here and have a mock where the first pod fails and the logic has to move on to another pod. Otherwise we aren't checking that this actually meets the specification.
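A sketch of what such a test could pin down, using a fake client factory whose first pod fails so the logic must fail over to the next one. The `clientFactory`-style interface, `flakyFactory`, and `connectToAnyPod` are hypothetical, trimmed-down stand-ins for the operator's types, not the real `ClientFactory` API:

```go
package main

import (
	"errors"
	"fmt"
)

// medusaClient is a minimal stand-in for the operator's Medusa client interface.
type medusaClient interface{ Addr() string }

type fakeClient struct{ addr string }

func (f fakeClient) Addr() string { return f.addr }

// flakyFactory fails for every address in its down set, mimicking a pod
// whose Medusa sidecar is unreachable.
type flakyFactory struct{ down map[string]bool }

func (f flakyFactory) NewClient(addr string) (medusaClient, error) {
	if f.down[addr] {
		return nil, errors.New("connection refused")
	}
	return fakeClient{addr: addr}, nil
}

// connectToAnyPod is the behaviour the test should verify: skip pods whose
// client cannot be created and succeed on a later one.
func connectToAnyPod(addrs []string, factory flakyFactory) (medusaClient, error) {
	for _, addr := range addrs {
		if c, err := factory.NewClient(addr); err == nil {
			return c, nil
		}
	}
	return nil, errors.New("no pod reachable")
}

func main() {
	factory := flakyFactory{down: map[string]bool{"10.0.0.1:50051": true}}
	c, err := connectToAnyPod([]string{"10.0.0.1:50051", "10.0.0.2:50051"}, factory)
	fmt.Println(c.Addr(), err)
}
```

In the real suite this would live in a `_test.go` file with the mock wired into the reconciler; the point is that the mock's first pod must fail so the failover path is actually exercised.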


sonarqubecloud bot commented May 6, 2024

Quality Gate failed

Failed conditions
45.1% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@adejanovski (Contributor, Author) left a comment:

Done with the required changes, could you do another round @Miles-Garnsey ?


@Miles-Garnsey (Member) left a comment:

A few more issues here; I think you might want to test more thoroughly, and also think more carefully about when you want to return and when you want to move on to the next pod.


if medusaClient, err := r.ClientFactory.NewClient(ctx, addr); err != nil {
logger.Error(err, "Failed to create Medusa client", "address", addr)
} else {
restoreHostMap, err := medusa.GetHostMap(request.Datacenter, *request.RestoreJob, medusaClient, ctx)
Miles-Garnsey (Member) commented:

Probably another issue here. If GetHostMap fails due to network issues, you surely want to continue on to the next pod again, don't you? Right now this returns...
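The shape of what the reviewer is asking for might look like the sketch below: a transient GetHostMap failure on one pod logs and moves on rather than aborting the whole restore. `getHostMapFrom`, the `hostMap` alias, and the injected fetch function are hypothetical stand-ins for the real `medusa.GetHostMap` call, not code from this PR:

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout simulates a transient network failure against one sidecar.
var errTimeout = errors.New("i/o timeout")

// hostMap stands in for the restore mapping returned by medusa.GetHostMap.
type hostMap map[string]string

// getHostMapFrom tries the GetHostMap call against each pod address and keeps
// going on failure, only returning an error when every pod has failed.
func getHostMapFrom(addrs []string, fetch func(addr string) (hostMap, error)) (hostMap, error) {
	var lastErr error
	for _, addr := range addrs {
		m, err := fetch(addr)
		if err != nil {
			fmt.Printf("GetHostMap failed on %s, trying next pod: %v\n", addr, err)
			lastErr = err
			continue // a network blip on one pod should not abort the restore
		}
		return m, nil
	}
	return nil, fmt.Errorf("GetHostMap failed on all pods: %w", lastErr)
}

func main() {
	addrs := []string{"10.0.0.1:50051", "10.0.0.2:50051"}
	fetch := func(addr string) (hostMap, error) {
		if addr == addrs[0] {
			return nil, errTimeout
		}
		return hostMap{"old-node-1": "new-node-1"}, nil
	}
	m, err := getHostMapFrom(addrs, fetch)
	fmt.Println(m, err)
}
```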

@Miles-Garnsey (Member) approved:

We're doing some refactoring later, so I'm approving this for now.

@adejanovski merged commit 5dba349 into main May 6, 2024
59 of 60 checks passed

Successfully merging this pull request may close #1066: Do not create Medusa standalone POD when it's not required