-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
e2e test for KafkaTopic webhook #1031
base: master
Are you sure you want to change the base?
Conversation
c272723
to
856ccc3
Compare
tests/e2e/kafkatopic_webhook.go
Outdated
) | ||
|
||
// TODO(mihalexa): move to k8s.go | ||
// applyK8sResourceFromTemplateWithDryRun is copy of applyK8sResourceFromTemplate which calls a "--dry-run=<strategy>" kubectl command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this could be solved with just adding an option for dry-run to the original command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such an option would need to become an input parameter and modifying the initial function was overkill, in my mind, as it's plenty good the way it is and our use of it doesn't normally (aka 90+% of the time) need it to execute with --dry-run
in mind.
Outside of what I deemed to be an optimized workflow for webhook testing (aka making use of the dry-run option), in our e2e tests we always want to fully create objects (aka have them written to etcd
) because that's in the nature of the tests: we create objects and we check their functionality.
The initial function is also part of a chain - it is not really standalone. Modifying its signature means modifying the signatures of all functions that relate to it in any way.
I preferred to add it as an optional/alternative function.
I could be convinced of the opposite but right now I don't feel like it is a 50-50 choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I get that but at he same time I'd like to avoid major code duplication if we can, and that in my head is more important than what you've written here but I won't hold up the PR on this if everyone else is fine with this reasoning, it's a personal preference thing for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a difficult topic.
I definitely wouldn't add a dryRunStrategy string
parameter to the original function.
But maybe we should pay the one time cost and add an extraArgs map[string][]string
parameter to that one which is passed down to the TerraTest call through the other helpers.
I know it's a big refactor and I can live with postponing it, but that looks like the desirable ideal solution to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The chain in question is : applyK8sResourceFromTemplate -> applyK8sResourceManifestFromString -> now moving into terratest
's k8s
pkg -> KubectlApplyFromStringE -> KubectlApplyE (inserts "apply -f") -> RunKubectlE -> RunKubectlAndGetOutputE -> shell.RunCommandAndGetOutputE.
Part of the issue with this chain is that if we want to use terratest
's KubectlApply<...>
functions (link here), they don't have ...args
in their signature at all - that's the root of the problem and otherwise it would've probably been an easy refactor. The first point in the chain where we get ...args
is RunKubectlE
.
Examples of function signatures relevant to us in the current implementation (also check comments I added at the end of some lines):
func KubectlApplyFromStringE(t testing.TestingT, options *KubectlOptions, configData string) error {
...
return KubectlApplyE(t, options, tmpfile)
}
func KubectlApplyE(t testing.TestingT, options *KubectlOptions, configPath string) error {
return RunKubectlE(t, options, "apply", "-f", configPath) <<<<< first args... ; this is where "apply -f" gets added
}
func RunKubectlE(t testing.TestingT, options *KubectlOptions, args ...string) error { <<<<< first args...
_, err := RunKubectlAndGetOutputE(t, options, args...)
return err
}
It is from the implementations behind that chain of functions that I pieced together a direct path to my goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to take Marton's suggestion from #1031 (comment) (which I am guessing was meant for this thread) and refactor applyK8sResourceManifestFromString
(and upwards one level in our own functions) to approximate the KubectlApply...
from terratest instead of using it directly ? (and in the process eliminate the need for a new function for dry-run ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to take Marton's suggestion from #1031 (comment) (which I am guessing was meant for this thread) and refactor applyK8sResourceManifestFromString (and upwards one level in our own functions) to approximate the KubectlApply... from terratest instead of using it directly ? (and in the process eliminate the need for a new function for dry-run ?)
Yeah, I think that was kind of my point as well, I meant it somewhat that way, I see why my thought was simpler but not working, but the end result seems similar to me.
tests/e2e/kafkatopic_webhook.go
Outdated
requireCreatingKafkaCluster(kubectlOptions, "../../config/samples/simplekafkacluster.yaml") | ||
testWebhookCreateKafkaTopic(kubectlOptions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation of e2e tests shouldn't this be done by this point? Not that these should cause problems in that case, it just might make stuff slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure about what you meant in the question. I will try to answer what I understood from it but please follow up if I didn't get it.
I favored having the webhook testing not rely on other previous tests and the flexibility this gives us.
There a few things I considered :
- our resources have different prerequisites (e.g. KafkaTopic requires that the indicated KafkaCluster exists) and we need a clear place to aggregate those
- this PR also aimed to provide a structure where other webhook e2e tests (and their prerequisites) could easily slot in ;
- e.g. kafkatopics may only require a/any kafkacluster but other things may require more intricate or specific setups
- this PR also aimed to provide a structure where other webhook e2e tests (and their prerequisites) could easily slot in ;
- our KafkaCluster related testing may not always be done on the absolutely most minimal setting possible while for the purposes of KafkaTopic webhook testing we don't care about the kafkacluster resource (so minimum is best) but there is no way to go around not having the KafkaCluster.
- this structure allowed me to decouple what kafkaclusters we test when we test kafkaclusters from what kafkaclusters we use when we test webhooks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'm fine with this reasoning, though I'd like to see how this looks when Marton's PR is merged.
Update: Checked back on that and I'd suggest modifying this to only really deal with the webhook tests themselves, not with installing a kafka clsuter and such as any real test case we might create which includes the webhook tests will include installing a kafka cluster and such, like it does now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think either we have to do everything from deploying components here or only do the webhooks and let the rest be built up at somewhere else, we are in the middle here which is not good because ensuring the kafka cluster is not the webhook test's responsibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create an explicit or implicit dependency on the KafkaCluster CR name, remove the creation and cleanup steps and reorder the function calls in the suite file so that webhook testing comes right after one of the KafkaCluster creations we do.
One worry I voiced is about how such code scales for other webhooks and their CRs and their prerequisites and dependencies.
For example, explicit dependency injection (via function parameters like a KafkaClusterName) might mean we would keeeeep adding params to function definitions and that's not something we want either.
Or it could be "implicit" and we just "assume" certain constants are used for names - which we already do "here and there".
I'll explain my initial stance: I wanted to separate the setup steps for each webhook because I worried the prerequisites between webhooks could vary enough to make another type of implementation very messy. That's why the sequence of I am/was proposing was : testWebhooks() -> testWebhookKafkaTopic(kubectlOptions) -> -> testCreate + testUpdate -> teardown.
I can be convinced to, for instance, not have the setup phase as part of testWebhookKafkaTopic()
.
Later edit: as in to not have it at all and depend on resources installed in the main test suite function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create an explicit or implicit dependency on the KafkaCluster CR name, remove the creation and cleanup steps and reorder the function calls in the suite file so that webhook testing comes right after one of the KafkaCluster creations we do.
Explicit rather, but otherwise this was my thought, yes.
I can be convinced to, for instance, not have the setup phase as part of testWebhookKafkaTopic().
Yeah with this naming IMO the setup/teardown of the components or clusters shouldn't be part of the function. Either we should name it testKafkaClusterCreateAndWebhookKafkaTopic()
or we need to make something else do the setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found a couple of typos and left a question for my own understanding, no major issues. Good job!
tests/e2e/kafkatopic_webhook.go
Outdated
}) | ||
} | ||
|
||
func testWebhookKafkTopic(kubectlOptions k8s.KubectlOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo
func testWebhookKafkTopic(kubectlOptions k8s.KubectlOptions) { | |
func testWebhookKafkaTopic(kubectlOptions k8s.KubectlOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for catching this. I'll have to update it in the function definition and at the call site.
Done in 10b2248.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are using VSCode/VSCodium the extension CodeSpellChecker (streetsidesoftware.code-spell-checker) can catch this in editing time.
856ccc3
to
10b2248
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, couple comments, overall looks really good, thanks.
tests/e2e/koperator_suite_test.go
Outdated
@@ -64,6 +64,7 @@ var _ = When("Testing e2e test altogether", Ordered, func() { | |||
testInstallKafkaCluster("../../config/samples/simplekafkacluster_ssl.yaml") | |||
testProduceConsumeInternalSSL(defaultTLSSecretName) | |||
testUninstallKafkaCluster() | |||
testWebhooks() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Shouldn't this be at line 62? This sounds like something which is both easier to test and required sooner than what comes from line 62.
I know you discussed somewhat this with Dávid, I'm coming from a slightly different angle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current implementation where webhook prerequisites are decoupled, the logical trouble I saw was that webhook testing would end up doing the install testing for the big CRs before those dedicated tests happen.
I'll give the example we have regarding KafkaTopic : in order to test anything about KafkaTopic we need a KafkaCluster to be up which means we would end up testing KafkaCluster installation (without meaning to) before the dedicated KafkaCluster installation and functional tests.
I saw that as not good so I dedided to place webhook testing towards the end as a category of its own which can then rely on dedicated install and functional tests working (and having been addressed in their own sections beforehand).
I can be convinced otherwise, especially within the context of this conversation : #1031 (comment) .
tests/e2e/kafkatopic_webhook.go
Outdated
) | ||
|
||
// TODO(mihalexa): move to k8s.go | ||
// applyK8sResourceFromTemplateWithDryRun is copy of applyK8sResourceFromTemplate which calls a "--dry-run=<strategy>" kubectl command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a difficult topic.
I definitely wouldn't add a dryRunStrategy string
parameter to the original function.
But maybe we should pay the one time cost and add an extraArgs map[string][]string
parameter to that one which is passed down to the TerraTest call through the other helpers.
I know it's a big refactor and I can live with postponing it, but that looks like the desirable ideal solution to me.
tests/e2e/kafkatopic_webhook.go
Outdated
requireCreatingKafkaCluster(kubectlOptions, "../../config/samples/simplekafkacluster.yaml") | ||
testWebhookCreateKafkaTopic(kubectlOptions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think either we have to do everything from deploying components here or only do the webhooks and let the rest be built up at somewhere else, we are in the middle here which is not good because ensuring the kafka cluster is not the webhook test's responsibility.
tests/e2e/kafkatopic_webhook.go
Outdated
dryRunStrategyServer, | ||
) | ||
Expect(err).To(HaveOccurred()) | ||
// Example error: The KafkaTopic "topic-test-internal" is invalid: spec.clusterRef.name: Invalid value: "kafkaNOT": kafkaCluster 'kafkaNOT' in the namespace 'kafka' does not exist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is this really meaningful with line 108 being present? I see it as somewhat of a noise, but could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to show the messages in full and then match on noteworthy parts of them.
Also, in future refactors, someone may have a better matching logic idea if they see the "real" and full expected message/output.
Also on the point of consistency: if it's deemed useful for most other situations to have the "example error" comment, then I'd rather see it even in situations where it can look redundant on first glance, at the very least as an example of sticking to a developer "communication" practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In such cases my practice was to build the message clearly from 1 level of indirection if variables are defined - but mostly no variables for reuse in end to end tests for clarity purposes so string literals are the interface we are using.
I'm biased towards my experience but I can try and convince myself about this approach as well.
LGTM! Line 418 in a13805f
like here: func deleteK8sResource( |
94f0a6e
94f0a6e
to
d76b478
Compare
applyK8sResourceManifest(kubectlOptions, tempPath) | ||
err = applyK8sResourceManifest(kubectlOptions, tempPath) | ||
if err != nil { | ||
return errors.WrapIfWithDetails(err, "applying CRD failed", "crd", string(crd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note:
When we use the emperror.dev/errors's WithDetails feature then IMHO the error message will not contain the details because it is stored in a separate structure member variable and not in the error message itself.
I think we should figure out something to solve this problem.
I favor the std Go error handling with the fmt.Errorf but I know the stacktrace will be not there. (it is also stored in a separate struct member variable in emperror case)
I know we use that lot of places in the e2e codebase
38e9f64
to
e0282d3
Compare
b66f9e3
to
274bc65
Compare
274bc65
to
d21676d
Compare
Description
This PR aims to provide an e2e test for the KafkaTopic validating webhook (for create and update operations).
I chose to implement this by using the
--dry-run=server
mode ofkubectl
as much as possible because that triggers all the api-server-side logic but without persisting the objects to storage (etcd) which is in exactly what we want in most cases during such tests. This way we have fewer objects to clean up and account/wait for.This mean I had to provide a local shortened replica of Terratest's
KubectlApplyFromStringE
because the function chain involved in the regular call to ourapplyK8sResourceFromTemplate()
does not allow passing...args
soon enough to be able to inject the--dry-run=server
I wanted.The chain in question is : applyK8sResourceFromTemplate -> applyK8sResourceManifestFromString -> now moving into terratest's k8s pkg -> KubectlApplyFromStringE -> KubectlApplyE (inserts "apply -f") -> RunKubectlE -> RunKubectlAndGetOutputE -> shell.RunCommandAndGetOutputE.
ℹ️
Update
operation tests look similar toCreate
ones but they are not identical neither in intent nor in output (there are some cases and errors specific only to Update). That is why I did not group them further in some common function.ℹ️ The reason for assertions like
Expect(len(strings.Split(output, "\n"))).To(Equal(1))
is that I intend to cause particular errors and test cases so I also want to match that little else happens outside of the targeted validation error.requireDeployKafkaTopic()
unusable outside of their initial context.The core issue at play here is that the pattern present in many tests like this one is incorrect.
Ginkgo runs in phases which means that at runtime, the code above doesn't quite execute in the order that it is written.
Doc reference : https://onsi.github.io/ginkgo/#mental-model-how-ginkgo-traverses-the-spec-hierarchy
To be precise :
What this means is that, in that current pattern, the
kubectlOptions
that is passed (by copy) to those functions is empty because the copying is done during theTree Construction Phase
(in aWhen
) but the useful value is populated inkubectlOptions
much later during theRun Phase
(in anIt
). This can be easily proven with debug/print statements.For this issue, during this PR, I only provided a temporary local fix so that the new tests can run and reuse the necessary functions. To some extent I'm asking you to overlook that part during review. It will get refactored after #1020.
ℹ️ I indicated some of my refactoring plans with comments including about moving some functions to other files. The reason for keeping as much as possible within a single file during this PR (but not forever!) is that #1020 is large and complicated and in case my PR gets merged first (on account of being far less complex), it will be significantly simpler to resolve conflicts or rebase #1020 if my PR mainly adds one noteworthy file (that I will break up later as indicated in the comments).
Type of Change
Checklist