-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parallel e2e testing with different kubeconfigs #1020
base: master
Are you sure you want to change the base?
Conversation
author marbarta <[email protected]> 1690214601 +0200 committer marbarta <[email protected]> 1690470420 +0200 create test report
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't get to go through every single line just yet but went through most of them - looking sharp in general.
Also left some comments that are mostly about formatting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments, overall doesn't look too bad, it's just very confusing to me in some ways, I really was hoping a simpler solution is possible to achieve this.
Otherwise I would only ask that we try to refrain from these very long PRs (I know it's not always possible, this is just a reminder) when we can, I'm pretty sure I missed a lot of stuff in there because I can't pay attention long enough to review something like this properly.
.github/workflows/e2e-test.yaml
Outdated
kind_k8s_version: v1.24.13 | ||
kubeconfig_dir_path: tests/e2e/kubeconfigs | ||
|
||
# TODO: ZookeeperCluster create time out always on the second kind cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is it an actual timeout issue or can Zookeeper not start for some reason and we retry until we reach a timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you can see the problem: https://github.com/banzaicloud/koperator/runs/15416787902
It would be nice to get more information why is this fail (like pod status)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in local environment all of the tests passed when multiple k8s cluster is used in parallel way I suspect that the problem is with timeouts or something else on github
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test the PR with multicluster input on your local.
This PR is not about to run tests on github on multiple kind cluster.
tests/e2e/pkg/tests/tests_test.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this WIP? A lot of unit tests seem to be missing from this still.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made unit test for the complex function.
Can you see another function with complex logic that need to be unit tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is we only have unit tests here for one struct's functions and nothing for any of the other functions. I know not everything can be unit tested easily but it would be nice to add most we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added multiple unit tests for functions where there are some logic.
tests/e2e/const.go
Outdated
@@ -53,7 +53,7 @@ const ( | |||
defaultTopicCreationWaitTime = 10 * time.Second | |||
defaultUserCreationWaitTime = 10 * time.Second | |||
|
|||
kafkaClusterCreateTimeout = 600 * time.Second // Increased from 600 to 700 for multiple kind | |||
kafkaClusterCreateTimeout = 600 * time.Second | |||
kafkaClusterResourceCleanupTimeout = 120 * time.Second | |||
kcatDeleetionTimeout = 40 * time.Second | |||
zookeeperClusterCreateTimeout = 7 * time.Minute // Increased from 4 to 7 for multiple kind |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as on line 56, I don't think we need this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please explain why should we remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not sure what sense it makes to have the explanation for this timeout here. For giving information to reviewers it would be fine as a review comment but I don't think this needs to be a comment on the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 that the comment "// Increased from 4 to 7 for multiple kind"
can be removed since it only provides the context about why the change is made, but it doesn't provide any valuable info itself along with the source code.
Or perhaps we can change the comment to something like this:
zookeeperClusterCreateTimeout = 7 * time.Minute // Increased from 4 to 7 for multiple kind | |
// increase timeout for supporting multiple Kind clusters | |
zookeeperClusterCreateTimeout = 7 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can remove but in this case who will face with this issue on multiple kind cluster on github will not know what was the original setting for one kind cluster setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think default values and the reason of their changes are valuable in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the increased timeout was not helped, I put back to the original value and I removed the comment
fixed: 129e7fe
tests/e2e/pkg/common/common.go
Outdated
return kubecontext, nil | ||
} | ||
|
||
// GetRawConfig creates a raw clientcmd api config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean in my head a get function isn't supposed to create things, just get them but what do the others think? @panyuenlau @mihaialexandrescu @pregnor ?
tests/e2e/pkg/tests/tests_test.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is we only have unit tests here for one struct's functions and nothing for any of the other functions. I know not everything can be unit tested easily but it would be nice to add most we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay to me, though I still have some comments open and there are some TODOs so I'll not approve just yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, will approve when the un-resolved comments are addressed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the excellent work!
Description
Type of Change
Checklist