E2E CI Test for Operator Bundle #28
Conversation
Force-pushed from ef0d8a4 to 8caa8ca.
/hold
Should merge after #30
Force-pushed from 8caa8ca to af2594e.
How hard would it be to enable either PSP or the PSP follow-on to turn on enforcement in our KinD cluster, so that running as non-root is proven out? If it is too much to take on with this PR, can you open an issue to track doing that longer term?
@gabemontero are you referring to PodSecurityPolicy (deprecated) or the new PodSecurity admission plugin (which has enforcement mechanisms for the new Pod Security Standards)? For the latter, I would prefer we add it in a follow-up PR, as the plugin is only available in k8s 1.22.
Referring to both. You could either:

If you do 1) now, great. If you want to open a tracking item to do 1) at some point in the future, outside of this PR, OK. Or if you open a tracking item to do 2), OK. As long as one of those three possible actions is taken, I'm good.
I don't think we need PodSecurityPolicy to ensure that we don't run pods as root. I can file an issue to enable the PodSecurity plugin when we upgrade to 1.22, and furthermore enforce the
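For reference, once the cluster is on 1.22+, turning on that enforcement could look like the sketch below. This is not part of this PR: the function name and namespace handling are illustrative, and `KUBECTL` is made overridable only so the sketch can be exercised without a cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch (not part of this PR): opt a namespace into the "restricted"
# Pod Security Standard via the PodSecurity admission plugin (k8s 1.22+).
# The restricted profile rejects pods that run as root.
KUBECTL="${KUBECTL:-kubectl}"

enforce_restricted_pods() {
  local namespace="$1"
  "${KUBECTL}" label --overwrite namespace "${namespace}" \
    pod-security.kubernetes.io/enforce=restricted
}
```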
Filed #33
/hold cancel
#30 merged.
/approve
Self-approving
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: adambkaplan. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
I think we should use the same tools across all projects in the Shipwright organization, such as the hack scripts we need for KinD, the local container registry, and so on. Currently we duplicate those scripts with each new iteration, with slight differences in each place, making future maintenance less predictable.

As well, I miss having the `Makefile` as the "entrypoint" for (almost) all automation that we need in a given project. The `Makefile` acts as the source of authority for automation, keeps the environment variables under control, and serves as a place to document and invoke our automation inventory. Likewise, we should try to have the same `Makefile` targets across all projects in the organization, as much as possible.

So, I think we should find a way to avoid duplicating our CI scripts, and make sure the `Makefile` is the concise place to trigger automation in the project. That could be something this PR tackles, or a future improvement.
Force-pushed from af2594e to 0285982.
I agree that this will become a challenge over time. It sounds like we are ready to build some test-infra tooling!
Filed shipwright-io/community#44 to discuss how to approach common tooling.
Some minor comments on scripting, but other than that it looks really good 👍🏼
hack/run-operator-catalog.sh
Outdated
```shell
attempts=1
while [[ ${attempts} -le 10 ]]; do
  echo "Checking the status of the operator rollout - attempt ${attempts}"
  if ${k8s} rollout status deployment "${namePrefix}operator" -n "${subNamespace}"; then
```
Here you can use the `--timeout` option and let the Kubernetes client handle the retries for you. We do the same on the CLI, please consider it. With the timeout option in place, we can also simplify the script by not having to handle the attempt for-loop.
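A minimal sketch of the suggested simplification. `KUBECTL` is made overridable only so the sketch can be exercised without a cluster, and the function name and default timeout are illustrative:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch of the suggestion: let "kubectl rollout status" poll internally via
# --timeout instead of driving a manual attempt loop in the script.
KUBECTL="${KUBECTL:-kubectl}"

wait_for_rollout() {
  local deployment="$1" namespace="$2" timeout="${3:-300s}"
  # Blocks until the rollout completes, or fails once the timeout elapses.
  "${KUBECTL}" rollout status deployment "${deployment}" \
    -n "${namespace}" --timeout="${timeout}"
}
```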
I decided to use a single function to wait on pod status.
Force-pushed from fef698a to 8dd6d1b.
hack/run-operator-catalog.sh
Outdated
```shell
# Pod may not exist yet, in which case wait 30 seconds and try again.
# The braces ensure the second wait only runs when the first one fails.
${KUBECTL_BIN} wait --for=condition=Ready pod -l "${label}" -n "${namespace}" --timeout "${timeout}" || \
  { sleep 30; ${KUBECTL_BIN} wait --for=condition=Ready pod -l "${label}" -n "${namespace}" --timeout "${timeout}"; }
```
The 30 second gap before the retry is needed because OLM does a lot of work to deploy the operator, and no pods may exist when the initial wait call is made.
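That retry-after-a-gap pattern can also be factored into a small helper; a sketch, where the helper name `retry_after` is hypothetical and not part of the PR:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run a command and, if it fails, pause and try it once
# more. Useful when OLM has not yet created the pods the first
# "kubectl wait" call targets.
retry_after() {
  local gap="$1"
  shift
  "$@" || { sleep "${gap}"; "$@"; }
}
```

With the helper in place, the script line above becomes `retry_after 30 ${KUBECTL_BIN} wait --for=condition=Ready pod -l "${label}" -n "${namespace}" --timeout "${timeout}"`.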
Force-pushed from 8dd6d1b to fdcbe7d.
bump @otaviof
test/kind/verify-kind.sh
Outdated
```shell
echo "# Using KinD context..."
${KUBECTL_BIN} config use-context "kind-kind"
cho "# KinD nodes:"
```
Have we lost the letter `e` on `echo`?
Yes, and not adding `set -e` here let it pass through 🤦
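A minimal illustration of why the typo slipped through: without `set -e`, bash logs the failed command and keeps going; with it, the script aborts. The `script_body` snippet below is illustrative, assuming `cho` does not exist as a command:

```shell
#!/usr/bin/env bash
# Run the same tiny script with and without "set -e" and compare exit codes.
# "cho" (the typo for "echo") is assumed not to resolve to any real command.
script_body='cho "# KinD nodes:"; echo "reached the end"'

bash -c "${script_body}" >/dev/null 2>&1; without_e=$?
bash -c "set -e; ${script_body}" >/dev/null 2>&1; with_e=$?

echo "without set -e the script still exits ${without_e}"
echo "with set -e it aborts with exit ${with_e}"
```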
Force-pushed from fdcbe7d to 4050bbc.
Thanks, very good additions to this project!
/lgtm
This change augments the e2e test suite to simulate the operator's deployment using OLM. The setup consists of the following components:

- A docker/distribution container registry, running in docker outside of any Kubernetes cluster.
- A KinD cluster that is configured to resolve image refs which use "localhost" as the image registry domain.
- An installation of OLM on the KinD cluster.

Once set up, the operator and its associated OLM bundle are built and pushed to the local container registry. Next, an OLM catalog is built based on the catalog published on operatorhub.io. The catalog is what allows OLM to find the Tekton operator that Shipwright depends on, and is likewise pushed to the local container registry.

Building the operator, bundle, and catalog with a fully on-cluster registry is problematic for several reasons:

- Not all tools can push to the on-cluster registry in this fashion.
- Manifests need to be rewritten to reference the on-cluster DNS name for the registry.
- The catalog source needs to be pullable within the cluster.

The test runs as follows:

- Create a namespace to run the operator under test.
- Create a CatalogSource using the catalog containing the operator under test.
- Create an OperatorGroup which allows AllNamespaces operators to be installed in the given namespace.
- Create a Subscription to install the Shipwright operator and its associated Tekton operator.
- Verify that the Shipwright operator deploys successfully.

Contributor documentation has also been updated so that developers can run this process using make commands on their Kubernetes cluster of choice.

See also:
- https://kind.sigs.k8s.io/docs/user/local-registry/
- https://olm.operatorframework.io/docs/tasks/creating-a-catalog/
- https://olm.operatorframework.io/docs/tasks/make-catalog-available-on-cluster/
- https://olm.operatorframework.io/docs/tasks/install-operator-with-olm/
- https://olm.operatorframework.io/docs/advanced-tasks/operator-scoping-with-operatorgroups/
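The OLM objects the test creates can be sketched as manifests piped to `kubectl apply -f -`. The resource names, channel, and image reference below are placeholders, not the values used by the actual scripts:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch of the OLM resources described above. All names and the catalog
# image are placeholders chosen for illustration.
olm_manifests() {
  local namespace="$1" catalog_image="$2"
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: test-catalog
  namespace: ${namespace}
spec:
  sourceType: grpc
  image: ${catalog_image}
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: test-operator-group
  namespace: ${namespace}
# No targetNamespaces in spec: an empty spec selects the AllNamespaces mode.
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: shipwright-operator
  namespace: ${namespace}
spec:
  channel: alpha
  name: shipwright-operator
  source: test-catalog
  sourceNamespace: ${namespace}
EOF
}

# Usage (sketch): olm_manifests <namespace> <catalog-image> | kubectl apply -f -
```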
Use the `operatorhub/catalog_sa` image as the base for the catalog index. The default operatorhub catalog appears to have a root-owned file that causes `opm index add` to fail. See operator-framework/operator-registry#870
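The catalog build this commit adjusts boils down to an `opm index add` invocation on a different base index; a sketch, where the function name is hypothetical and image references are left as parameters rather than guessed. `OPM` is made overridable only so the sketch can be exercised without the tool:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: add the operator bundle to a catalog index built on top of an
# existing base index (e.g. the operatorhub "catalog_sa" variant).
OPM="${OPM:-opm}"

build_catalog_index() {
  local bundle="$1" base_index="$2" output_index="$3"
  "${OPM}" index add \
    --bundles "${bundle}" \
    --from-index "${base_index}" \
    --tag "${output_index}"
}
```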
o.MatchError will panic if a k8s NotFound error is returned. This is fixed by checking that a NotFound error is raised separately from a gomega match.
Force-pushed from 4050bbc to dd409c0.
New changes are detected. LGTM label has been removed.
Re-tagged @otaviof's lgtm (I needed to push a minor fix for a change that caused tests to break).
Changes
This change augments the e2e test suite to simulate the operator's deployment using OLM.
The setup consists of the following components:
- A docker/distribution container registry, running in docker outside of any Kubernetes cluster.
- A KinD cluster that is configured to resolve image refs which use "localhost" as the image registry domain.
- An installation of OLM on the KinD cluster.

Once set up, the operator and its associated OLM bundle are built and pushed to the local container registry.
Next, an OLM catalog is built based on the catalog published on operatorhub.io.
The catalog is what allows OLM to find the Tekton operator that Shipwright depends on, and is likewise pushed to the local container registry.
Building the operator, bundle, and catalog with a fully on-cluster registry is problematic for several reasons:
- Not all tools can push to the on-cluster registry in this fashion.
- Manifests need to be rewritten to reference the on-cluster DNS name for the registry.
- The catalog source needs to be pullable within the cluster.

The test runs as follows:
- Create a namespace to run the operator under test.
- Create a CatalogSource using the catalog containing the operator under test.
- Create an OperatorGroup which allows AllNamespaces operators to be installed in the given namespace.
- Create a Subscription to install the Shipwright operator and its associated Tekton operator.
- Verify that the Shipwright operator deploys successfully.

See also:
- https://kind.sigs.k8s.io/docs/user/local-registry/
- https://olm.operatorframework.io/docs/tasks/creating-a-catalog/
- https://olm.operatorframework.io/docs/tasks/make-catalog-available-on-cluster/
- https://olm.operatorframework.io/docs/tasks/install-operator-with-olm/
- https://olm.operatorframework.io/docs/advanced-tasks/operator-scoping-with-operatorgroups/
/kind cleanup
Submitter Checklist
See the contributor guide for details on coding conventions, GitHub and prow interactions, and the code review process.
Release Notes