This lab walks through some basic best practices for operators using AKS. In many cases, operator and developer best practices overlap.
- Complete the previous labs before starting this one.
This lab includes the following exercises, which can be completed in any order:
- Image Vulnerability Scanning
- Upgrade Kubernetes Regularly
- Process Node Updates and Reboots Using kured
- Enforce Resource Quotas
- Pod Disruption Budgets
- Network Security
- Provide Dedicated Nodes using Taints and Tolerations
- Using Role Based Access Control (RBAC)
- Pod Identities
- Backup and Business Continuity
- App Armor and seccomp Filtering
- Use kube-advisor to check for issues
It is critical to scan images for vulnerabilities in your environment. We recommend using an enterprise-grade tool such as Aqua Security or Twistlock.
There are also numerous open source tools that provide basic image scanning for CVEs: https://opensource.com/article/18/8/tools-container-security
These tools should be integrated into the CI/CD pipeline, container registry, and container runtime to provide end-to-end protection. Review the full guidance here: https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-container-image-management
In our lab, we will use Anchore for some quick testing. https://anchore.com/opensource
Install anchore with helm
```
helm install --name anchore-demo stable/anchore-engine
```
Note: It may take a few minutes for all of the pods to start and for the CVE data to be loaded into the database.
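If you want to watch the Anchore pods come up before moving on, one way (using the same `app` label that appears in the next step) is:

```
kubectl get pods -l app=anchore-demo-anchore-engine -w
```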
Exec into the analyzer pod to access the CLI
```
kubectl get pod -l app=anchore-demo-anchore-engine,component=analyzer

NAME                                                   READY   STATUS    RESTARTS   AGE
anchore-demo-anchore-engine-analyzer-974d7479d-7nkgp   1/1     Running   0          2h

# now exec in
kubectl exec -it anchore-demo-anchore-engine-analyzer-974d7479d-7nkgp bash
```
Set environment variables to configure the CLI (while exec'd into the pod)
```
export ANCHORE_CLI_USER=admin
export ANCHORE_CLI_PASS=foobar
export ANCHORE_CLI_URL=http://anchore-demo-anchore-engine-api.default.svc.cluster.local:8228/v1/
```
Check status (while exec'd into pod)
```
anchore-cli system status

Service catalog (anchore-demo-anchore-engine-catalog-5fd7c96898-xq5nj, http://anchore-demo-anchore-engine-catalog:8082): up
Service analyzer (anchore-demo-anchore-engine-analyzer-974d7479d-7nkgp, http://anchore-demo-anchore-engine-analyzer:8084): up
Service apiext (anchore-demo-anchore-engine-api-7866dc7fcc-nk2l7, http://anchore-demo-anchore-engine-api:8228): up
Service policy_engine (anchore-demo-anchore-engine-policy-578f59f48d-7bk9v, http://anchore-demo-anchore-engine-policy:8087): up
Service simplequeue (anchore-demo-anchore-engine-simplequeue-5b5b89977c-nzg8r, http://anchore-demo-anchore-engine-simplequeue:8083): up

Engine DB Version: 0.0.8
Engine Code Version: 0.3.1
```
Connect Anchore to ACR (you will need to set these variables yourself, since they are not available inside the container)
```
APPID=
CLIENTSECRET=
ACRNAME=

anchore-cli registry add --registry-type docker_v2 $ACRNAME $APPID $CLIENTSECRET

Registry: youracr.azurecr.io
User: 59343209-9d9e-464d-8508-068a3d331fb9
Type: docker_v2
Verify TLS: True
Created: 2019-01-10T17:56:05Z
Updated: 2019-01-10T17:56:05Z
```
Add our images and check for issues
```
anchore-cli image add youracr.azurecr.io/hackfest/service-tracker-ui:1.0
anchore-cli image add youracr.azurecr.io/hackfest/data-api:1.0
anchore-cli image add youracr.azurecr.io/hackfest/flights-api:1.0
anchore-cli image add youracr.azurecr.io/hackfest/quakes-api:1.0
anchore-cli image add youracr.azurecr.io/hackfest/weather-api:1.0
```

```
# first wait for all images to be "analyzed"
anchore-cli image list

# then view results (there are none in these images thankfully)
anchore-cli image vuln youracr.azurecr.io/hackfest/service-tracker-ui:1.0 all

# show os packages
anchore-cli image content youracr.azurecr.io/hackfest/service-tracker-ui:1.0 os
```
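Beyond listing vulnerabilities, you can also gate on Anchore's policy evaluation, which is the kind of check you would wire into a CI/CD pipeline as mentioned above. A rough sketch using the default policy bundle:

```
# block until analysis finishes, then evaluate the image against the active policy bundle
anchore-cli image wait youracr.azurecr.io/hackfest/service-tracker-ui:1.0
anchore-cli evaluate check youracr.azurecr.io/hackfest/service-tracker-ui:1.0 --detail
```

The evaluate command exits non-zero on a policy failure, which is what lets a pipeline fail the build.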
To stay current on new features and bug fixes, regularly upgrade the Kubernetes version of your AKS cluster.
```
# check which upgrade versions are available
az aks get-upgrades --resource-group $RGNAME --name $CLUSTERNAME
```

```
# change the Kubernetes version below based on your cluster (if an upgrade is available)
KUBERNETES_VERSION=1.11.5

az aks upgrade --resource-group $RGNAME --name $CLUSTERNAME --kubernetes-version $KUBERNETES_VERSION
```
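Once the upgrade completes, one way to confirm it is to check the version reported by the nodes and by the cluster itself:

```
# each node should report the new Kubernetes version
kubectl get nodes -o wide

# the cluster's control plane version
az aks show --resource-group $RGNAME --name $CLUSTERNAME --query kubernetesVersion -o tsv
```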
AKS automatically downloads and installs security fixes on each worker node, but it does not automatically reboot the node if a reboot is required.
The open-source kured (KUbernetes REboot Daemon) project from Weaveworks watches for pending node reboots. When a node has applied updates that require a reboot, the node is safely cordoned and drained so its pods are rescheduled onto other nodes in the cluster before it restarts.
Deploy kured into the AKS cluster
```
kubectl apply -f https://github.com/weaveworks/kured/releases/download/1.1.0/kured-1.1.0.yaml
```
Check status. Since kured is installed with a DaemonSet, you will see one kured pod per node in your cluster.
```
kubectl get pod -n kube-system -l name=kured

NAME          READY   STATUS    RESTARTS   AGE
kured-22s4v   1/1     Running   0          1m
kured-5hbqz   1/1     Running   0          1m
kured-ljsv4   1/1     Running   0          1m
kured-mnkcl   1/1     Running   0          1m
kured-wv9pp   1/1     Running   0          1m
```
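If you are curious what kured is doing, you can tail the DaemonSet's logs; it periodically checks each node for a pending reboot:

```
kubectl logs -n kube-system -l name=kured --tail=20
```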
You're done. Reboots will happen automatically whenever applied security updates require them.
- You can also force a reboot to test the setup, as described here and sketched below.
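For reference, kured's default trigger is the /var/run/reboot-required sentinel file that Ubuntu creates when an update needs a restart. Assuming you have shell access to a worker node (for example through a jump box), you can simulate a pending reboot like this:

```
# run on a worker node: create the sentinel file kured watches for
sudo touch /var/run/reboot-required
```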
In our first lab, we introduced resource quotas with namespaces. You can review those steps again here.
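As a quick refresher, a namespace quota can also be created imperatively; the name and limits below are only illustrative:

```
# example quota in the hackfest namespace (adjust the values for your workloads)
kubectl create quota demo-quota -n hackfest \
  --hard=requests.cpu=1,requests.memory=1Gi,limits.cpu=2,limits.memory=2Gi

kubectl describe quota demo-quota -n hackfest
```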
We can use pod disruption budgets (PDBs) to make sure a minimum number of pods remains available. PDBs help maintain availability during voluntary disruptions to our deployments, such as container image upgrades.
First, scale out a deployment to create more pods for the test
```
kubectl scale deployment service-tracker-ui -n hackfest --replicas=4
```
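You can confirm the deployment reports four replicas before continuing:

```
kubectl get deployment service-tracker-ui -n hackfest
```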
Create a pod disruption budget
Review the spec and then deploy:
```
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: service-tracker-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: service-tracker-ui
```
```
kubectl apply -f ~/kubernetes-hackfest/labs/best-practices/operators/pod-disruption-budget.yaml -n hackfest
```
Create a new version of the service-tracker-ui
```
az acr build -t hackfest/service-tracker-ui:newversion -r $ACRNAME --no-logs ~/kubernetes-hackfest/app/service-tracker-ui
```
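If you want to verify the new tag landed in the registry before rolling it out, you can list the repository tags:

```
az acr repository show-tags --name $ACRNAME --repository hackfest/service-tracker-ui -o table
```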
Watch the pods in the namespace
```
watch kubectl get pod -n hackfest
```
Perform the upgrade (you will need a new terminal/cloud shell session for this)
```
# replace briaracr.azurecr.io with your own ACR login server
kubectl set image deployment/service-tracker-ui service-tracker-ui=briaracr.azurecr.io/hackfest/service-tracker-ui:newversion -n hackfest
```
Based on our PDB, at least 2 pods should remain running at all times while the pods are updated to the new image.
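You can also watch the budget itself and the rollout to see the PDB being honored (the PDB name comes from the manifest above):

```
kubectl get pdb service-tracker-pdb -n hackfest
kubectl rollout status deployment/service-tracker-ui -n hackfest
```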
Finally, use kube-advisor to check the cluster for deployments that are missing resource requests and limits.

Create a service account and role binding
```
kubectl apply -f ~/kubernetes-hackfest/labs/best-practices/operators/sa-kube-advisor.yaml
```
Create the pod
```
kubectl run --rm -i -t kube-advisor --image=mcr.microsoft.com/aks/kubeadvisor --restart=Never --overrides="{ \"apiVersion\": \"v1\", \"spec\": { \"serviceAccountName\": \"kube-advisor\" } }"
```
Review the results in the pod output.
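If kube-advisor flags a deployment without resource requests or limits, one way to address it is with kubectl set resources; the values below are only examples, not recommendations for this app:

```
# add example requests and limits to a flagged deployment
kubectl set resources deployment service-tracker-ui -n hackfest \
  --requests=cpu=100m,memory=128Mi --limits=cpu=250m,memory=256Mi
```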