title | layout |
---|---|
Migration from API server to CRDs | docwithnav |
Upgrading Service Catalog from versions 0.2.x (and earlier) to 0.3.x requires a database migration. This document describes how the migration works and what actions must be performed.
NOTE: Before starting the migration, make sure that you have performed a full backup of your cluster. You should also test the procedure in a test environment first.
The above picture describes changes in the Service Catalog architecture made between versions 0.2.0 and 0.3.0:
- Custom Resource Definitions (native K8S feature) are now used to store Service Catalog objects
- etcd and Aggregated API Server are no longer needed
- Webhook Server was added to perform data validation/mutation using the admission webhooks mechanism
The Service Catalog Helm release can be upgraded using the helm upgrade command, which runs all necessary actions.
The upgrade to CRDs consists of the following steps:
- Make the API Server read-only. Before the backup starts, all resource changes are blocked so that the backup is a consistent snapshot. No resource may change while the migration tool is backing up resources.
- Check whether the Apiserver deployment with the given name exists. If the deployment is not found, the migration is skipped.
- Scale down the Controller Manager to avoid resource processing, such as Secret deletion.
- Back up Service Catalog custom resources to files in a Persistent Volume.
- Remove the OwnerReference fields in all Secrets referenced by any ServiceBinding. This is needed to avoid Secret deletion.
- Remove all Service Catalog resources. This must be done if the Service Catalog uses the main Kubernetes etcd instance.
- Upgrade the Service Catalog: remove the API Server, install the CRDs and the Webhook Server, and roll out the Controller Manager.
- Scale down the Controller Manager to avoid any resource processing while applying resources.
- Restore all resources. The migration tool sets all necessary fields added in Service Catalog 0.3.0. Creating resources triggers all logic implemented in the webhooks, so we can be sure that all data is consistent. ServiceInstances are created and then updated because of the class and plan reference fields. The validation webhook denies creating a ServiceInstance if the reference to a ClusterServiceClass or ServiceClass is not set in the following fields: Spec.ClusterServiceClassRef, Spec.ClusterServicePlanRef, Spec.ServiceClassRef, Spec.ServicePlanRef. These fields are set during an update operation.
- Add the proper OwnerReference to all Secrets referenced by ServiceBindings.
- Scale up the Controller Manager.
NOTE: In step 6 (removing all Service Catalog resources), there is no difference between a Service Catalog upgrade using your own etcd and one using the main Kubernetes etcd.
Execute the backup action to scale down the Controller Manager, remove owner references in Secrets and store all resources in a specified folder, then delete all Service Catalog resources.
./service-catalog migration --action backup --storage-path=data/ --service-catalog-namespace=catalog --controller-manager-deployment=catalog-catalog-controller-manager --apiserver-deployment=catalog-catalog-apiserver
Uninstall the old Service Catalog and install the new one (version 0.3.0).
Execute the restore action to restore all resources and scale up the Controller Manager.
./service-catalog migration --action restore --storage-path=data/ --service-catalog-namespace=catalog --controller-manager-deployment=catalog-catalog-controller-manager
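The two manual phases above can be wrapped in a small helper so that both invocations share the same settings. This is a minimal sketch and not part of Service Catalog: the function name `migration_cmd` and the variable names are hypothetical, the values are copied from the example commands above, and the function only prints the command line so that it can be reviewed before it is run.

```shell
# Hypothetical helper: prints the migration command for a given action.
# The values mirror the example commands in this document; adjust them
# for your own cluster.
STORAGE_PATH="data/"
SC_NAMESPACE="catalog"
CONTROLLER_MANAGER="catalog-catalog-controller-manager"
APISERVER="catalog-catalog-apiserver"

migration_cmd() {
  action="$1"
  cmd="./service-catalog migration --action ${action} --storage-path=${STORAGE_PATH}"
  cmd="${cmd} --service-catalog-namespace=${SC_NAMESPACE}"
  cmd="${cmd} --controller-manager-deployment=${CONTROLLER_MANAGER}"
  # The Apiserver deployment name is only passed for the backup phase.
  if [ "${action}" = "backup" ]; then
    cmd="${cmd} --apiserver-deployment=${APISERVER}"
  fi
  printf '%s\n' "${cmd}"
}

# Print both commands for review.
migration_cmd backup
migration_cmd restore
```

Once the printed commands look right, run the backup command, replace the old Service Catalog with the new one, and then run the restore command.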
The migration tool is a set of helper functions integrated into the Service Catalog binary.
To run the migration tool, compile the service-catalog binary by executing the following command:
make build
If you run the migration tool on macOS and want to get a native binary, add the PLATFORM environment variable:
PLATFORM=darwin make build
The resulting executable file can be found in the bin subdirectory.
CAUTION: You can safely remove the migration job PVC that contains data about your Service Catalog resources ONLY after the migration has completed successfully.
To perform a successful migration, the Service Catalog resources must not be in an unfinished or deleted state. Otherwise, the upgrade job can fail.
To check if your cluster is ready for migration, use the sanity check script.
The script checks whether the Service Catalog resources are prepared for migration. If some of them are not ready, the script prints an appropriate error message.
There are a few possible messages you can see:
- There are being deleted {type} - prints the list of resources of a given type with deletionTimestamp set.
- There are {type} in progress - prints the list of resources of a given type with asyncOpInProgress set to true.
- ServiceClass not exist for the ServiceInstances: - prints the list of ServiceInstances whose ServiceClass was deleted.
The above errors can be fixed manually. Read more about it in this document.
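The first two conditions can also be approximated by hand with kubectl. The sketch below is not the sanity check script itself: `find_blockers` is a hypothetical helper that reads three whitespace-separated columns (name, deletionTimestamp, asyncOpInProgress) and reports the resources that would block the migration. It assumes the input has the shape produced by `kubectl get ... -o custom-columns`, which prints `<none>` for unset fields, and that the in-progress flag is exposed at `.status.asyncOpInProgress`.

```shell
# Hypothetical helper, not the official sanity check script.
# Reads lines of "NAME DELETION_TIMESTAMP ASYNC_OP_IN_PROGRESS" on stdin
# and prints one line per problem found.
find_blockers() {
  awk '
    NF < 3         { next }                               # skip malformed lines
    $2 != "<none>" { print $1 ": being deleted" }         # deletionTimestamp is set
    $3 == "true"   { print $1 ": operation in progress" } # asyncOpInProgress is true
  '
}

# Example input in the assumed custom-columns shape (made-up resource names).
printf '%s\n' \
  'db-1 <none> false' \
  'db-2 2020-01-01T10:00:00Z false' \
  'db-3 <none> true' | find_blockers
```

Against a real cluster, the input could be produced with `kubectl get serviceinstances -n <namespace> --no-headers -o custom-columns=NAME:.metadata.name,DELETING:.metadata.deletionTimestamp,ASYNC:.status.asyncOpInProgress` and piped into `find_blockers`. The same approach would apply to other Service Catalog resource types.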
You can run the service-catalog binary with the migration parameter, which triggers the migration process. For example, run:
./service-catalog migration --action restore --storage-path=data/ --service-catalog-namespace=catalog --controller-manager-deployment=catalog-catalog-controller-manager
Flag | Description |
---|---|
action | Specifies the action which must be executed. The possible values are backup or restore. |
storage-path | Points to a folder where resources will be saved. |
service-catalog-namespace | Specifies the namespace in which the Service Catalog is installed. |
controller-manager-deployment | Provides the Controller Manager deployment name. |
apiserver-deployment | Provides the Apiserver deployment name. It is required only for the backup phase. |
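
The table implies a rule that can be checked before the tool is invoked: the Apiserver deployment name must be supplied for backup but not for restore. The following sketch encodes that rule in a hypothetical `check_migration_args` function. It illustrates the table only, covers just the two actions listed there, and is not the validation performed by the binary itself.

```shell
# Hypothetical pre-flight check derived from the flag table; not part of the tool.
# Usage: check_migration_args ACTION [APISERVER_DEPLOYMENT]
check_migration_args() {
  action="${1:-}"
  apiserver="${2:-}"
  case "${action}" in
    backup)
      # The table marks apiserver-deployment as required for the backup phase.
      if [ -z "${apiserver}" ]; then
        echo "backup requires --apiserver-deployment" >&2
        return 1
      fi
      ;;
    restore)
      # restore does not need the Apiserver deployment name.
      ;;
    *)
      echo "action must be backup or restore, got '${action}'" >&2
      return 1
      ;;
  esac
}

check_migration_args backup catalog-catalog-apiserver && echo "backup arguments look valid"
check_migration_args restore && echo "restore arguments look valid"
```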
In order to get a consistent backup, we have to make sure that no resources are modified during the backup process.
To achieve that, the migration tool creates a ValidatingWebhookConfiguration at the beginning of the backup process to intercept and reject all attempts to mutate Service Catalog resources. Because of a limitation of the Aggregated API Server used in the previous version of Service Catalog, this webhook call fails with the following message:
failed calling webhook "validating.reject-changes-to-sc-crds.servicecatalog.k8s.io":
webhook does not accept v1beta1 AdmissionReviewRequest
This error message appears whenever any Service Catalog resource is created or modified during the backup process. It means that the write protection works as expected.
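When scripting around the backup, it can be useful to tell this expected rejection apart from other failures. The following is a minimal sketch, assuming the error text shown above reaches the caller unchanged. `is_write_protection_error` is a hypothetical helper that matches only on the webhook name.

```shell
# Hypothetical helper: succeeds when the given error text mentions the
# write-protection webhook created by the migration tool.
BLOCKER_WEBHOOK="validating.reject-changes-to-sc-crds.servicecatalog.k8s.io"

is_write_protection_error() {
  case "$1" in
    *"${BLOCKER_WEBHOOK}"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with the message quoted in this document.
msg='failed calling webhook "validating.reject-changes-to-sc-crds.servicecatalog.k8s.io": webhook does not accept v1beta1 AdmissionReviewRequest'
if is_write_protection_error "${msg}"; then
  echo "write protection is active"
fi
```

Matching on the webhook name rather than on the full sentence keeps the check stable if the surrounding wording of the error differs.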
To test the mutation blocking feature, execute the following commands:
- to enable the write protection:
./service-catalog migration --action deploy-blocker --service-catalog-namespace=default
- to disable the write protection:
./service-catalog migration --action undeploy-blocker --service-catalog-namespace=default
You can delete all the migration-related resources using this command:
kubectl delete clusterrole,clusterrolebinding,serviceaccount,job -n catalog -l migration-job=true
If your migration job failed, you can check its logs using the following command:
kubectl logs -n catalog -l migration-job=true
To revert the upgrade, use the helm rollback command, which restores the API Server version of the Service Catalog.
Before you proceed, you must delete all the Service Catalog resources and CRDs. You must also delete the resources that are not necessary for the Service Catalog API Server version. Use the following commands:
kubectl delete crd -l svcat=true
kubectl delete secret -n catalog catalog-catalog-webhook-cert
kubectl delete sa -n catalog service-catalog-webhook
kubectl delete sa -n catalog clean-job-account
Then you can execute the rollback using this command:
helm rollback catalog 1 --cleanup-on-fail --no-hooks
The migration job PVC with the data about your Service Catalog resources still exists after the rollback. You can perform the migration again to restore your resources.
To restore your Service Catalog resources, copy their data from the PVC onto your local machine using the kubectl cp command. Then, run the migration again with the command:
./service-catalog migration --action restore --storage-path={PATH_TO_YOUR_RESOURCES} --service-catalog-namespace=catalog --controller-manager-deployment=catalog-catalog-controller-manager
TIP: Use the sanity check script to ensure that the upgrade will succeed and your resources will be restored.