This document describes how to install and use the driver.
Ensure the following information and requirements can be met prior to installation.
- The following ZFS Storage Appliance information (see your ZFSSA device administrator):
- The name or the IP address of the appliance. If it is a name, it must be DNS resolvable. When the appliance is a clustered system, the driver's connection for management operations is tied to one head at deployment time. Driver behavior in takeover/failback scenarios therefore depends on the management interface settings and on whether the interface remains locked to the failed head.
- Login access to your ZFSSA in the form of a user name and password. It is recommended to create a dedicated (non-root) user with the required authorizations.
- The appliance certificate for the REST endpoint is available.
- The name of the appliance storage pool from which volumes will be provisioned.
- The name of the project in the pool.
- In secure mode, the driver supports only TLSv1.2 for HTTPS connections to the ZFSSA. Make sure that TLSv1.2 is enabled for the HTTPS service on the ZFSSA; a quick way to check this from a client is shown below.
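If you are unsure whether TLSv1.2 is accepted, a client-side handshake test like the following can help. This is a sketch only; it assumes the REST/BUI endpoint listens on port 215 and that openssl is available on the host you run it from.

  # Attempt a TLSv1.2-only handshake against the appliance management endpoint.
  # A successful handshake prints the negotiated protocol and cipher.
  openssl s_client -connect <zfssa>:215 -tls1_2 </dev/null 2>/dev/null | grep -E 'Protocol|Cipher'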
The user on the appliance must have at least the following authorizations (where pool and project are those that will be used in the storage class); root should not be used for provisioning.
- Object: nas.<pool>.<project>.*
- Permissions:
- changeAccessProps
- changeGeneralProps
- changeProtocolProps
- changeSpaceProps
- changeUserQuota
- clearLocks
- clone
- createShare
- destroy
- rollback
- scheduleSnap
- takeSnap
- destroySnap
- renameSnap
The file system being exported must have 'Share Mode' set to 'Read/Write' in the 'NFS' section of the file system's 'Protocol' tab (under 'Shares').
More than one pool/project combination is possible when different storage classes identify different pools and projects (a storage class sketch is shown below).
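For illustration only, the relationship between a storage class and a pool/project pair might look like the sketch below. The provisioner string and parameter names here are placeholders and assumptions; use the example storage class files shipped with the driver as the authoritative reference.

  # Hypothetical sketch only: one storage class bound to one pool/project pair.
  # The provisioner name and parameter keys are assumptions, and other required
  # parameters are omitted. Check the driver's example storage classes.
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: zfssa-csi-nfs-example
  provisioner: <zfssa csi driver name>     # as registered by the plugin
  parameters:
    pool: <appliance storage pool>
    project: <project within the pool>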
- The Kubernetes cluster namespace you must use (see your cluster administrator).
- Sidecar images

  Make sure you have access to the registry or registries containing these images from the worker nodes. The image pull policy (imagePullPolicy) is set to IfNotPresent in the deployment files. During the first deployment the container runtime will likely try to pull them. If your container runtime cannot access the images, you will have to pull them manually before deployment (example pull commands are shown after this list). The required images are:
- node-driver-registrar v2.9.0+.
- external-attacher v4.4.0+.
- external-provisioner v3.6.0+.
- external-resizer v1.9.0+.
- external-snapshotter v6.3.0+.
- livenessprobe v2.11.0
- snapshot-controller v6.3.0
The current deployment uses the sidecar images built by Oracle and available from the Oracle Container Registry (container-registry.oracle.com/olcne/). Refer to the current deployment for more information.
Sidecars are also available from the Kubernetes team.
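If the worker nodes cannot reach the registry at deployment time, pull the images manually beforehand with your container runtime's client. The following is a sketch only; the image path and tag are placeholders, so substitute the exact references from your deployment files.

  # Example only: pre-pull one sidecar image on each worker node.
  # Replace the image path and tag with the values used in your deployment files,
  # and use docker or crictl instead of podman if that matches your runtime.
  podman pull container-registry.oracle.com/olcne/csi-node-driver-registrar:v2.9.0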
- Plugin image

  You can pull the plugin image from a registry that hosts it, or you can build it and store it in one of your registries. In either case, as with the sidecar images, the container runtime must have access to that registry; if not, you will have to pull the image manually before deployment. If you choose to build the plugin yourself, use version 1.21.0 or above of the Go compiler.
This volume driver supports both NFS (filesystem) and iSCSI (block) volumes. Preparation for iSCSI currently requires some setup; see the information below.
- Install the iSCSI client utilities on the Kubernetes worker nodes:

  $ yum install iscsi-initiator-utils -y

  Verify that iscsid and iscsi are running after installation (systemctl status iscsid iscsi).
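If either service is not active, it can typically be enabled and started with systemctl. A minimal sketch, assuming systemd-based worker nodes:

  # Enable and start the iSCSI services on the worker node.
  sudo systemctl enable --now iscsid iscsi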
- Create an initiator group on the Oracle ZFS Storage Appliance per worker node name. For example, if your worker node name is pmonday-olcne-worker-0, then there should be an initiator group named pmonday-olcne-worker-0 on the target appliance containing the IQN of that worker node. The initiator IQN can be determined by looking at /etc/iscsi/initiatorname.iscsi.
- Create one or more targets and target groups on the interface that you intend to use for iSCSI traffic.
- CHAP is not supported at this time.
- Cloud instances often have duplicate IQNs; these MUST be regenerated so that each node's IQN is unique, or connection storms happen (Instructions). One way to do this is sketched below.
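The sketch below assumes iscsi-initiator-utils is installed (it provides the iscsi-iname tool); adapt it to the linked instructions for your platform.

  # Generate a fresh, unique initiator IQN and restart the iSCSI daemon.
  sudo sh -c 'echo "InitiatorName=$(/sbin/iscsi-iname)" > /etc/iscsi/initiatorname.iscsi'
  sudo systemctl restart iscsid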
- There are cases where fresh instances do not start the iscsi service properly and report the condition below; temporarily modify iscsi.service to remove the ConditionDirectoryNotEmpty condition (one approach is sketched below):

  Condition: start condition failed at Wed 2020-10-28 18:37:35 GMT; 1 day 4h ago
  ConditionDirectoryNotEmpty=/var/lib/iscsi/nodes was not met
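One way to do this temporarily is a systemd drop-in that clears the condition. This is a sketch, assuming systemd; revert it once the underlying issue is resolved.

  # Create a drop-in for iscsi.service and clear the directory condition.
  # In the editor opened by 'systemctl edit', add:
  #   [Unit]
  #   ConditionDirectoryNotEmpty=
  sudo systemctl edit iscsi
  sudo systemctl restart iscsi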
- iSCSI may get timeouts in particular networking conditions. Review the following web pages for possible solutions: the first involves modifying sysctl settings, the second involves changing the iSCSI replacement timeout.
- There is a condition where a 'uefi' target creates noise in iSCSI discovery; this is noticeable in the iscsid output (systemctl status iscsid). This issue appears in Oracle Linux 7 in a virtualized environment:
● iscsid.service - Open-iSCSI
Loaded: loaded (/usr/lib/systemd/system/iscsid.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2020-10-28 17:30:17 GMT; 1 day 23h ago
Docs: man:iscsid(8)
man:iscsiuio(8)
man:iscsiadm(8)
Main PID: 1632 (iscsid)
Status: "Ready to process requests"
Tasks: 1
Memory: 6.4M
CGroup: /system.slice/iscsid.service
└─1632 /sbin/iscsid -f -d2
Oct 30 16:23:02 pbm-kube-0-w1 iscsid[1632]: iscsid: disconnecting conn 0x56483ca0f050, fd 7
Oct 30 16:23:02 pbm-kube-0-w1 iscsid[1632]: iscsid: connecting to 169.254.0.2:3260
Oct 30 16:23:02 pbm-kube-0-w1 iscsid[1632]: iscsid: connect to 169.254.0.2:3260 failed (Connection refused)
Oct 30 16:23:02 pbm-kube-0-w1 iscsid[1632]: iscsid: deleting a scheduled/waiting thread!
Oct 30 16:23:03 pbm-kube-0-w1 iscsid[1632]: iscsid: Poll was woken by an alarm
Oct 30 16:23:03 pbm-kube-0-w1 iscsid[1632]: iscsid: re-opening session -1 (reopen_cnt 55046)
Oct 30 16:23:03 pbm-kube-0-w1 iscsid[1632]: iscsid: disconnecting conn 0x56483cb55e60, fd 9
Oct 30 16:23:03 pbm-kube-0-w1 iscsid[1632]: iscsid: connecting to 169.254.0.2:3260
Oct 30 16:23:03 pbm-kube-0-w1 iscsid[1632]: iscsid: connect to 169.254.0.2:3260 failed (Connection refused)
Oct 30 16:23:03 pbm-kube-0-w1 iscsid[1632]: iscsid: deleting a scheduled/waiting thread!
Ensure that:
- All worker nodes have the NFS packages installed for their operating system:

  $ yum install nfs-utils -y
- All worker nodes are running the rpc.statd daemon.
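A quick check on a systemd-based node is sketched below; in nfs-utils the unit is typically named rpc-statd.

  # Verify rpc.statd is running, and enable/start it if it is not.
  systemctl status rpc-statd
  sudo systemctl enable --now rpc-statd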
The Kubernetes Volume Snapshot feature became GA in Kubernetes v1.20.
When installing from the example Helm charts, the snapshot controller and the required RBAC roles and CRDs are deployed along with the driver. If your Kubernetes deployment already contains a snapshot deployment, modify the example Helm deployment as needed.
After deployment, snapshot-related resources such as the following are applied:
customresourcedefinition.apiextensions.k8s.io/volumesnapshotclasses.snapshot.storage.k8s.io created
customresourcedefinition.apiextensions.k8s.io/volumesnapshotcontents.snapshot.storage.k8s.io created
customresourcedefinition.apiextensions.k8s.io/volumesnapshots.snapshot.storage.k8s.io created
serviceaccount/snapshot-controller created
clusterrole.rbac.authorization.k8s.io/snapshot-controller-runner created
clusterrolebinding.rbac.authorization.k8s.io/snapshot-controller-role created
role.rbac.authorization.k8s.io/snapshot-controller-leaderelection created
rolebinding.rbac.authorization.k8s.io/snapshot-controller-leaderelection created
statefulset.apps/snapshot-controller created
Their details can be viewed using the kubectl get command:
NAME READY STATUS RESTARTS AGE
pod/snapshot-controller-0 1/1 Running 0 5h22m
...
NAME READY AGE
statefulset.apps/snapshot-controller 1/1 5h22m
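You can also confirm that the snapshot CRDs are installed, for example:

  # List the snapshot-related CRDs created during deployment.
  kubectl get crd | grep snapshot.storage.k8s.io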
A sample Helm chart is available in the deploy/helm directory; this method provides a simpler deployment than the YAML-file procedure in the section below.
Create a local-values.yaml file that, at a minimum, sets the values for the zfssaInformation section. Depending on your environment, the image block may also need updates if the identified repositories cannot be reached. A sketch of such a file is shown after the encoding examples below.
The secrets must be Base64 encoded. There are many ways to Base64-encode strings and files; this technique encodes a user name of 'demo' for use in the values file, on a Mac with the base64 tool installed:
echo -n 'demo' | base64
The following example shows how to get the server certificate of ZFSSA and encode it:
openssl s_client -connect <zfssa>:215 2>/dev/null </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | base64
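For reference, a minimal local-values.yaml might look like the sketch below. The key names are assumptions based on the description above; treat the chart's own values.yaml as the authoritative reference.

  # Hypothetical local-values.yaml sketch; verify key names against the chart's values.yaml.
  zfssaInformation:
    username: ZGVtbw==                         # base64 of 'demo'
    password: <base64-encoded password>
    cert: <base64-encoded certificate chain>
  # Adjust the chart's image block here as well if the default repositories are not reachable.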
Deploy the driver using Helm 3:
helm install -f local-values/local-values.yaml zfssa-csi ./k8s-1.17
When all pods are running, move to verification.
To deploy the plugin using YAML files, follow the steps listed below. They assume you are installing at least version 0.4.0 of the plugin on a cluster running version 1.17 of Kubernetes, and that you are using Kubernetes secrets to provide the appliance login access information and certificate. They also use the generic values described below; when following these steps, substitute your own values.
| Information | Value |
|---|---|
| Appliance name or IP address | myappliance |
| Appliance login user | mylogin |
| Appliance password | mypassword |
| Appliance file certificate | mycertfile |
| Appliance storage pool | mypool |
| Appliance storage project | myproject |
| Cluster namespace | myspace |
- The driver requires a file (zfssa.yaml) mounted as a volume at /mnt/zfssa. The volume should be an in-memory volume, and the file should be provided by a secure secret service that shares the secret via a sidecar, such as a HashiCorp Vault agent that interacts with Vault via role-based access controls. The file has the following format:

  username: <text>
  password: <text>
For development only, other mechanisms can be used to create and share the secret with the container; one example is shown after the warning below.
The driver uses a YAML parser to read this file. Because passwords often contain special characters, enclose the password in double quotes.
Warning: Do not store your credentials in source code control (such as this project). For production environments, use a secure secret store that encrypts at rest and can provide credentials through role-based access controls (refer to the Kubernetes documentation). Do not use the root user in production environments; a purpose-built user provides better audit controls and isolation for the driver.
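For development, one possibility is a plain Kubernetes secret created from the zfssa.yaml file and mounted by the deployment. This is a sketch only and is not suitable for production; the secret name must match the one referenced by ZFSSA_TARGET in the deployment files.

  # Development only: create the login secret from a local zfssa.yaml file.
  kubectl create secret generic oracle.zfssa.csi.node.myappliance -n myspace --from-file=./zfssa.yaml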
- Create the Kubernetes secret containing the certificate chain for the appliance and make it available to the driver in a mounted volume (/mnt/certs) with the file name zfssa.crt. While a certificate chain is a public document, it is typically also provided by a volume mounted from a secret provider to protect the chain of trust and bind it to the instance.
To create a Kubernetes secret from the certificate chain:
kubectl create secret generic oracle.zfssa.csi.node.myappliance.certs -n myspace --from-file=./mycertfile
For development only, it is possible to run without the appliance chain of trust; see the options for the driver.
- Update the deployment files.
- zfssa-csi-plugin.yaml

  In the DaemonSet section make the following modifications:
  - in the container node-driver-registrar subsection:
    - set image to the appropriate container image.
  - in the container zfssabs subsection:
    - set image for the container zfssabs to the appropriate container image.
    - in the env subsection:
      - under ZFSSA_TARGET, set valueFrom.secretKeyRef.name to oracle.zfssa.csi.node.myappliance
      - under ZFSSA_INSECURE, set value to False (if you choose to secure the communication with the appliance; otherwise leave it set to True)
  - in the volume subsection (skip if you set ZFSSA_INSECURE to True):
    - under cert, if you want communication with the appliance to be secure (as illustrated below):
      - set secret.secretName to oracle.zfssa.csi.node.myappliance.certs
      - set secret.items.key to mycertfile
      - set secret.items.path to mycertfile
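As a reference for the cert volume edits above, the resulting volume entry would look roughly like the fragment below. It is illustrative only, assembled from the steps above; edit the actual zfssa-csi-plugin.yaml rather than pasting this in.

  # Illustrative fragment of the DaemonSet volumes section after the edits above.
  volumes:
    - name: cert
      secret:
        secretName: oracle.zfssa.csi.node.myappliance.certs
        items:
          - key: mycertfile
            path: mycertfile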
- zfssa-csi-provisioner.yaml

  In the StatefulSet section make the following modifications:
  - set image for the container zfssa-csi-provisioner to the appropriate container image.
  - set image for the container zfssa-csi-attacher to the appropriate container image.
- Deploy the plugin by running the following commands:
kubectl apply -n myspace -f ./zfssa-csi-rbac.yaml
kubectl apply -n myspace -f ./zfssa-csi-plugin.yaml
kubectl apply -n myspace -f ./zfssa-csi-provisioner.yaml
At this point the command kubectl get all -n myspace should return something similar to this:

NAME                             READY   STATUS    RESTARTS   AGE
pod/zfssa-csi-nodeplugin-lpts9   2/2     Running   0          3m22s
pod/zfssa-csi-nodeplugin-vdb44   2/2     Running   0          3m22s
pod/zfssa-csi-provisioner-0      2/2     Running   0          72s

NAME                                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/zfssa-csi-nodeplugin    2         2         2       2            2           <none>          3m16s

NAME                                     READY   AGE
statefulset.apps/zfssa-csi-provisioner   1/1     72s
### Deployment Example Using an NFS Share
Refer to the NFS EXAMPLE README file for details.
### Deployment Example Using a Block Volume
Refer to the BLOCK EXAMPLE README file for details.