Add support for testing aws-hosted-cp
* Break KubeClient helpers into provider specific file.
* Try to simplify the validation process for lots of different providers
  with different requirements.
* Finish aws-hosted-cp test and add comments through test to make it
  easier to understand.
* Use GinkgoHelper across e2e tests, populate hosted vars from
  AWSCluster.
* No longer rely on local registry for images in test/e2e.
* Support OS for awscli install.
* Prepend hostname to collected log artifacts.
* Support no cleanup of provider specs, differentiate ci
  cluster names.
* Add docs on running tests, do not wait for all providers
  if configured.
* Reinstantiate resource validation map on each instance of
  validation.
* Enable the external-gc feature via annotation, featureGate
  bool. (Closes: #152)
* Bump aws-*-cp templates to 0.1.3
* Bump cluster-api-provider-aws template to 0.1.2
* Improve test logging to log template name and validation
  phase.
* Bump k0s version to v1.30.4+k0s.0, set CCM nodeSelector to
  null for aws-hosted-cp. (Closes: #290)
* Break cleanup into separate job so that it is unaffected by
  concurrency group cancellations.
* Make dev-aws-nuke target less PHONY.

Closes: #212

Signed-off-by: Kyle Squizzato <[email protected]>
squizzi committed Sep 16, 2024
1 parent b05008b commit d590bfb
Showing 31 changed files with 960 additions and 423 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -26,7 +26,7 @@ env:

jobs:
build:
name: Build and Unit Test
name: Build and Test
runs-on: ubuntu-latest
steps:
- name: Checkout repository
85 changes: 70 additions & 15 deletions .github/workflows/test.yml
@@ -1,12 +1,7 @@
name: E2E Tests

concurrency:
group: test-${{ github.head_ref || github.run_id }}
cancel-in-progress: true

on:
pull_request_target:
types: [labeled]
pull_request:
branches:
- main
- release-*
@@ -15,31 +10,91 @@ on:
- '**.md'
env:
GO_VERSION: '1.22'
AWS_REGION: us-west-2
AWS_ACCESS_KEY_ID: ${{ secrets.CI_AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.CI_AWS_SECRET_ACCESS_KEY }}

jobs:
e2etest:
concurrency:
group: test-e2e-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
name: E2E Tests
runs-on: ubuntu-latest
if: contains(github.event.pull_request.labels.*.name, 'test-e2e')
env:
AWS_REGION: us-west-2
AWS_ACCESS_KEY_ID: ${{ secrets.CI_AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.CI_AWS_SECRET_ACCESS_KEY }}
outputs:
clustername: ${{ steps.vars.outputs.clustername }}
version: ${{ steps.vars.outputs.version }}
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
- name: Set up Buildx
uses: docker/setup-buildx-action@v3
- name: Login to GHCR
uses: docker/[email protected]
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Get outputs
id: vars
run: |
echo "version=$(git describe --tags --always)" >> $GITHUB_OUTPUT
echo "clustername=ci-$(date +%s)-e2e-test" >> $GITHUB_OUTPUT
- name: Build and push HMC controller image
uses: docker/build-push-action@v6
with:
build-args: |
LD_FLAGS=-s -w -X github.com/Mirantis/hmc/internal/build.Version=${{ steps.vars.outputs.version }}
context: .
platforms: linux/amd64,linux/arm64
tags: |
ghcr.io/mirantis/hmc/controller-ci:${{ steps.vars.outputs.version }}
push: true
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Prepare and push HMC template charts
run: |
make hmc-chart-release
REGISTRY_REPO="oci://ghcr.io/mirantis/hmc/charts-ci" make helm-push
- name: Setup kubectl
uses: azure/setup-kubectl@v4
- name: Run E2E tests
env:
MANAGED_CLUSTER_NAME: ${{ steps.vars.outputs.clustername }}
REGISTRY_REPO: 'oci://ghcr.io/mirantis/hmc/charts-ci'
IMG: 'ghcr.io/mirantis/hmc/controller-ci:${{ steps.vars.outputs.version }}'
run: |
make test-e2e
- name: Archive test results
if: ${{ failure() }}
uses: actions/upload-artifact@v4
with:
name: test-logs
path: |
test/e2e/*.log
cleanup:
name: Cleanup
needs: e2etest
runs-on: ubuntu-latest
if: ${{ always() }}
timeout-minutes: 15
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
- name: AWS Test Resources
env:
CLUSTER_NAME: '${{ needs.e2etest.outputs.clustername }}'
run: |
make dev-aws-nuke
45 changes: 34 additions & 11 deletions Makefile
@@ -2,6 +2,8 @@ NAMESPACE ?= hmc-system
VERSION ?= $(shell git describe --tags --always)
# Image URL to use all building/pushing image targets
IMG ?= hmc/controller:latest
IMG_REPO = $(shell echo $(IMG) | cut -d: -f1)
IMG_TAG = $(shell echo $(IMG) | cut -d: -f2)
# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
ENVTEST_K8S_VERSION = 1.29.0

@@ -103,10 +105,11 @@ tidy:
test: generate-all fmt vet envtest tidy external-crd ## Run tests.
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test $$(go list ./... | grep -v /e2e) -coverprofile cover.out

# Utilize Kind or modify the e2e tests to load the image locally, enabling compatibility with other vendors.
.PHONY: test-e2e # Run the e2e tests against a Kind k8s instance that is spun up.
# Utilize Kind or modify the e2e tests to load the image locally, enabling
# compatibility with other vendors.
.PHONY: test-e2e # Run the e2e tests using a Kind k8s instance as the management cluster.
test-e2e: cli-install
KIND_CLUSTER_NAME="hmc-test" KIND_VERSION=$(KIND_VERSION) go test ./test/e2e/ -v -ginkgo.v -timeout=2h

.PHONY: lint
lint: golangci-lint ## Run golangci-lint linter & yamllint
@@ -240,6 +243,13 @@ hmc-deploy: helm

.PHONY: dev-deploy
dev-deploy: ## Deploy HMC helm chart to the K8s cluster specified in ~/.kube/config.
@$(YQ) eval -i '.image.repository = "$(IMG_REPO)"' config/dev/hmc_values.yaml
@$(YQ) eval -i '.image.tag = "$(IMG_TAG)"' config/dev/hmc_values.yaml
@if [ "$(REGISTRY_REPO)" = "oci://127.0.0.1:$(REGISTRY_PORT)/charts" ]; then \
$(YQ) eval -i '.controller.defaultRegistryURL = "oci://$(REGISTRY_NAME):5000/charts"' config/dev/hmc_values.yaml; \
else \
$(YQ) eval -i '.controller.defaultRegistryURL = "$(REGISTRY_REPO)"' config/dev/hmc_values.yaml; \
fi; \
$(MAKE) hmc-deploy HMC_VALUES=config/dev/hmc_values.yaml
$(KUBECTL) rollout restart -n $(NAMESPACE) deployment/hmc-controller-manager

@@ -317,15 +327,16 @@ dev-mcluster-delete: envsubst
.PHONY: dev-creds-apply
dev-creds-apply: dev-$(DEV_PROVIDER)-creds

.PHONY: envsubst awscli dev-aws-nuke
dev-aws-nuke: ## Warning: Destructive! Nuke all AWS resources deployed by 'DEV_PROVIDER=aws dev-provider-apply', prefix with CLUSTER_NAME to nuke a specific cluster.
.PHONY: dev-aws-nuke
dev-aws-nuke: envsubst awscli yq cloud-nuke ## Warning: Destructive! Nuke all AWS resources deployed by 'DEV_PROVIDER=aws dev-provider-apply', prefix with CLUSTER_NAME to nuke a specific cluster.
@CLUSTER_NAME=$(CLUSTER_NAME) YQ=$(YQ) AWSCLI=$(AWSCLI) bash -c "./scripts/aws-nuke-ccm.sh elb"
@CLUSTER_NAME=$(CLUSTER_NAME) $(ENVSUBST) < config/dev/cloud_nuke.yaml.tpl > config/dev/cloud_nuke.yaml
DISABLE_TELEMETRY=true $(CLOUDNUKE) aws --region $$AWS_REGION --force --config config/dev/cloud_nuke.yaml --resource-type vpc,eip,nat-gateway,ec2-subnet,elb,elbv2,internet-gateway,network-interface,security-group
DISABLE_TELEMETRY=true $(CLOUDNUKE) aws --region $$AWS_REGION --force --config config/dev/cloud_nuke.yaml --resource-type vpc,eip,nat-gateway,ec2,ec2-subnet,elb,elbv2,ebs,internet-gateway,network-interface,security-group
@rm config/dev/cloud_nuke.yaml
@CLUSTER_NAME=$(CLUSTER_NAME) YQ=$(YQ) AWSCLI=$(AWSCLI) bash -c ./scripts/aws-nuke-ccm.sh
@CLUSTER_NAME=$(CLUSTER_NAME) YQ=$(YQ) AWSCLI=$(AWSCLI) bash -c "./scripts/aws-nuke-ccm.sh ebs"

.PHONY: cli-install
cli-install: clusterawsadm clusterctl cloud-nuke yq awscli ## Install the necessary CLI tools for deployment, development and testing.
cli-install: clusterawsadm clusterctl cloud-nuke envsubst yq awscli ## Install the necessary CLI tools for deployment, development and testing.

##@ Dependencies

@@ -450,9 +461,21 @@ $(ENVSUBST): | $(LOCALBIN)
.PHONY: awscli
awscli: $(AWSCLI)
$(AWSCLI): | $(LOCALBIN)
curl "https://awscli.amazonaws.com/awscli-exe-$(OS)-$(shell uname -m)-$(AWSCLI_VERSION).zip" -o "/tmp/awscliv2.zip"
unzip /tmp/awscliv2.zip -d /tmp
/tmp/aws/install -i $(LOCALBIN)/aws-cli -b $(LOCALBIN) --update
@if [ $(OS) == "linux" ]; then \
curl "https://awscli.amazonaws.com/awscli-exe-linux-$(shell uname -m)-$(AWSCLI_VERSION).zip" -o "/tmp/awscliv2.zip"; \
unzip /tmp/awscliv2.zip -d /tmp; \
/tmp/aws/install -i $(LOCALBIN)/aws-cli -b $(LOCALBIN) --update; \
fi; \
if [ $(OS) == "darwin" ]; then \
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"; \
installer -pkg AWSCLIV2.pkg -target $(LOCALBIN) -applyChoiceChangesXML choices.xml; \
rm AWSCLIV2.pkg; \
fi; \
if [ $(OS) == "windows" ]; then \
echo "Installing to $(LOCALBIN) on Windows is not yet implemented"; \
exit 1; \
fi; \


# go-install-tool will 'go install' any package with custom target and name of binary, if it doesn't exist
# $1 - target path with name of binary (ideally with version)
30 changes: 28 additions & 2 deletions docs/aws/hosted-control-plane.md
@@ -19,7 +19,12 @@ reused with a management cluster.
If you deployed your AWS Kubernetes cluster using Cluster API Provider AWS (CAPA)
you can obtain all the necessary data with the commands below or use the
template found below in the
[HMC ManagedCluster manifest generation](#hmc-managed-cluster-manifest-generation) section.

If using the `aws-standalone-cp` template to deploy a hosted cluster, it is
recommended to use a `t3.large` or larger instance type, as the `hmc-controller`
and other provider controllers need a large amount of resources to run.
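
As a minimal sketch, that sizing could be set through the `ManagedCluster`
config (the field names here mirror the hosted template's config shape and are
assumptions, not taken from this commit):

```
kubectl apply -f - <<EOF
apiVersion: hmc.mirantis.com/v1alpha1
kind: ManagedCluster
metadata:
  name: aws-standalone          # hypothetical name
spec:
  template: aws-standalone-cp
  config:
    region: us-west-2           # assumed field
    controlPlane:
      instanceType: t3.large    # sized for hmc-controller and provider controllers
    worker:
      instanceType: t3.large
EOF
```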

**VPC ID**

@@ -89,7 +94,7 @@ Grab the following `ManagedCluster` manifest template and save it to a file name
apiVersion: hmc.mirantis.com/v1alpha1
kind: ManagedCluster
metadata:
name: aws-hosted-cp
name: aws-hosted
spec:
template: aws-hosted-cp
config:
@@ -109,3 +114,24 @@ Then run the following command to create the `managedcluster.yaml`:
```
kubectl get awscluster cluster -o go-template="$(cat managedcluster.yaml.tpl)" > managedcluster.yaml
```
## Deployment Tips
* Ensure HMC templates and the controller image are somewhere public and
fetchable.
* For installing the HMC charts and templates from a custom repository, load
the `kubeconfig` from the cluster and run the commands:

```
KUBECONFIG=kubeconfig IMG="ghcr.io/mirantis/hmc/controller-ci:v0.0.1-179-ga5bdf29" REGISTRY_REPO="oci://ghcr.io/mirantis/hmc/charts-ci" make dev-apply
KUBECONFIG=kubeconfig make dev-templates
```
* The infrastructure will need to be manually marked `Ready` to get the
`MachineDeployment` to scale up. You can patch the `AWSCluster` kind using
the command:
```
KUBECONFIG=kubeconfig kubectl patch AWSCluster <hosted-cluster-name> --type=merge --subresource status --patch 'status: {ready: true}' -n hmc-system
```
For additional information on why this is required [click here](https://docs.k0smotron.io/stable/capi-aws/#:~:text=As%20we%20are%20using%20self%2Dmanaged%20infrastructure%20we%20need%20to%20manually%20mark%20the%20infrastructure%20ready.%20This%20can%20be%20accomplished%20using%20the%20following%20command).
31 changes: 31 additions & 0 deletions docs/dev.md
@@ -107,3 +107,34 @@ export KUBECONFIG=~/.kube/config
kubectl --kubeconfig ~/.kube/config get secret -n hmc-system <managedcluster-name>-kubeconfig -o=jsonpath={.data.value} | base64 -d > kubeconfig
```

## Running E2E tests locally
E2E tests can be run locally via the `make test-e2e` target. To deploy the same
way CI does, a non-local registry is needed, and the Helm charts and
hmc-controller image must already exist on that registry, for example on GHCR:

```
IMG="ghcr.io/mirantis/hmc/controller-ci:v0.0.1-179-ga5bdf29" \
REGISTRY_REPO="oci://ghcr.io/mirantis/hmc/charts-ci" \
make test-e2e
```

Optionally, the `NO_CLEANUP=1` env var can be used to keep the `After` nodes in
some specs from running. This makes it possible to debug tests by re-running
them without waiting for a new infrastructure deployment each time. For
subsequent runs, pass the `MANAGED_CLUSTER_NAME=<cluster name>` env var to tell
the test which cluster name to use, so that it does not generate a new name and
deploy a new cluster.
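
For example (the cluster name below is illustrative):

```
# First run: leave the deployed infrastructure in place on failure
NO_CLEANUP=1 make test-e2e

# Re-run against the same cluster without deploying a new one
NO_CLEANUP=1 MANAGED_CLUSTER_NAME=ci-1234567890-e2e-test make test-e2e
```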

Tests that run locally use autogenerated names like `12345678-e2e-test` while
tests that run in CI use names such as `ci-1234567890-e2e-test`. You can always
pass `MANAGED_CLUSTER_NAME=` from the get-go to customize the name used by the
test.

### Nuke created resources
In CI we run `make dev-aws-nuke` to clean up test resources; you can do so
manually with:

```
CLUSTER_NAME=example-e2e-test make dev-aws-nuke
```
50 changes: 26 additions & 24 deletions scripts/aws-nuke-ccm.sh
@@ -33,28 +33,30 @@ if [ -z $AWSCLI ]; then
exit 1
fi

echo "Checking for ELB with 'kubernetes.io/cluster/$CLUSTER_NAME' tag"
for LOADBALANCER in $($AWSCLI elb describe-load-balancers --output yaml | $YQ '.LoadBalancerDescriptions[].LoadBalancerName');
do
echo "Checking ELB: $LOADBALANCER for 'kubernetes.io/cluster/$CLUSTER_NAME tag"
DESCRIBE_TAGS=$($AWSCLI elb describe-tags \
--load-balancer-names $LOADBALANCER \
--output yaml | $YQ '.TagDescriptions[].Tags.[]' | grep 'kubernetes.io/cluster/$CLUSTER_NAME')
if [ ! -z "${DESCRIBE_TAGS}" ]; then
echo "Deleting ELB: $LOADBALANCER"
$AWSCLI elb delete-load-balancer --load-balancer-name $LOADBALANCER
fi
done
if [ "$1" == "elb" ]; then
echo "Checking for ELB with '$CLUSTER_NAME' tag"
for LOADBALANCER in $($AWSCLI elb describe-load-balancers --output yaml | $YQ '.LoadBalancerDescriptions[].LoadBalancerName');
do
echo "Checking ELB: $LOADBALANCER for tag"
DESCRIBE_TAGS=$($AWSCLI elb describe-tags --load-balancer-names $LOADBALANCER --output yaml | $YQ '.TagDescriptions[]' | grep $CLUSTER_NAME)
if [ ! -z "${DESCRIBE_TAGS}" ]; then
echo "Deleting ELB: $LOADBALANCER"
$AWSCLI elb delete-load-balancer --load-balancer-name $LOADBALANCER
fi
done
fi

echo "Checking for EBS Volumes with $CLUSTER_NAME within the 'kubernetes.io/created-for/pvc/name' tag"
for VOLUME in $($AWSCLI ec2 describe-volumes --output yaml | $YQ '.Volumes[].VolumeId');
do
echo "Checking EBS Volume: $VOLUME for $CLUSTER_NAME claim"
DESCRIBE_VOLUMES=$($AWSCLI ec2 describe-volumes \
--volume-id $VOLUME \
--output yaml | $YQ '.Volumes | to_entries[] | .value.Tags[] | select(.Key == "kubernetes.io/created-for/pvc/name")' | grep $CLUSTER_NAME)
if [ ! -z "${DESCRIBE_VOLUMES}" ]; then
echo "Deleting EBS Volume: $VOLUME"
$AWSCLI ec2 delete-volume --volume-id $VOLUME
fi
done
if [ "$1" == "ebs" ]; then
echo "Checking for EBS Volumes with '$CLUSTER_NAME' within the 'kubernetes.io/created-for/pvc/name' tag"
for VOLUME in $($AWSCLI ec2 describe-volumes --output yaml | $YQ '.Volumes[].VolumeId');
do
echo "Checking EBS Volume: $VOLUME for $CLUSTER_NAME claim"
DESCRIBE_VOLUMES=$($AWSCLI ec2 describe-volumes \
--volume-id $VOLUME \
--output yaml | $YQ '.Volumes | to_entries[] | .value.Tags[] | select(.Key == "kubernetes.io/created-for/pvc/name")' | grep $CLUSTER_NAME)
if [ ! -z "${DESCRIBE_VOLUMES}" ]; then
echo "Deleting EBS Volume: $VOLUME"
$AWSCLI ec2 delete-volume --volume-id $VOLUME
fi
done
fi
6 changes: 3 additions & 3 deletions templates/cluster/aws-hosted-cp/Chart.yaml
@@ -1,18 +1,18 @@
apiVersion: v2
name: aws-hosted-cp
description: |
An HMC template to deploy a k8s cluster on AWS with control plane components
within the management cluster.
type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.2
version: 0.1.3
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.30.2+k0s.0"
appVersion: '1.30.4+k0s.0'
annotations:
hmc.mirantis.com/infrastructure-providers: aws
hmc.mirantis.com/controlplane-providers: k0smotron
1 change: 1 addition & 0 deletions templates/cluster/aws-hosted-cp/templates/awscluster.yaml
@@ -4,6 +4,7 @@ metadata:
name: {{ include "cluster.name" . }}
annotations:
cluster.x-k8s.io/managed-by: k0smotron
aws.cluster.x-k8s.io/external-resource-gc: "true"
finalizers:
- hmc.mirantis.com/cleanup
spec:
Expand Up @@ -46,7 +46,8 @@ spec:
- --cluster-name={{ include "cluster.name" . }}
# Removing the default `node-role.kubernetes.io/control-plane` node selector
# TODO: it does not work
# nodeSelector: ""
nodeSelector:
node-role.kubernetes.io/control-plane: null
- name: aws-ebs-csi-driver
namespace: kube-system
chartname: aws-ebs-csi-driver/aws-ebs-csi-driver