I would like to share my configuration for testing vSphere volumes on OpenShift, and I hope it gives you a smoother experience with the setup.
vSphere Configuration
- Create a folder “RHEL” for all the VMs
- Create OPENSHIFT as the resource pool and place all the VMs under this resource pool
- The name of each virtual machine must match the name of the corresponding node in the OpenShift cluster. In other words, the VM name used in vCenter must match the node name used in the Ansible inventory for the installation. For example, my VM is named pocnode.sc.ocpdemo.online and I used the same FQDN in the OpenShift inventory file.
- To prepare the environment for the vSphere cloud provider, follow the steps below.
- Set up the GOVC environment:
curl -LO https://github.com/vmware/govmomi/releases/download/v0.15.0/govc_linux_amd64.gz
gunzip govc_linux_amd64.gz
chmod +x govc_linux_amd64
cp govc_linux_amd64 /usr/bin/govc
export GOVC_URL='vCenter IP OR FQDN'
export GOVC_USERNAME='vCenter User'
export GOVC_PASSWORD='vCenter Password'
export GOVC_INSECURE=1
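Once these variables are exported, it is worth confirming that govc can actually reach vCenter before going further. A quick sanity check (govc about prints vCenter version information):

# Verify connectivity and credentials against vCenter
govc about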
- Find the host VM path:
govc ls /datacenter/vm/<vm-folder-name>
- Set disk.EnableUUID to true for all VMs:
govc vm.change -e="disk.enableUUID=1" -vm='VM Path/vm-name'
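If you have many VMs, a small shell loop can apply the setting to each one. This is a sketch that assumes all cluster VMs live under the RHEL folder created earlier:

# Apply disk.enableUUID to every VM under the folder
# (assumes the GOVC_* variables from the setup step are still exported)
for vm in $(govc ls /Datacenter/vm/RHEL); do
  govc vm.change -e="disk.enableUUID=1" -vm="$vm"
done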
OpenShift Configuration
vSphere.conf
To configure OpenShift to use vSphere volumes, you must configure the vSphere cloud provider by creating the file /etc/origin/cloudprovider/vsphere.conf as shown below.
[Global]
user = "vcenter username"
password = "vcenter password"
port = "443"
insecure-flag = "1"
datacenters = "Datacenter"

[VirtualCenter "1.2.3.4"]

[Workspace]
server = "1.2.3.4"
datacenter = "Datacenter"
folder = "/Datacenter/vm/RHEL"
default-datastore = "Shared-NFS"
resourcepool-path = "OPENSHIFT"

[Disk]
scsicontrollertype = pvscsi

[Network]
public-network = "VM Network 2"
Observations:
- Use govc ls to figure out the value for folder.
- Use just the name of the resource pool for resourcepool-path, not the entire path. Both values can be double-checked with govc, as shown below.
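A sketch of those checks, assuming the Datacenter, Cluster, and datastore names used in this post:

# Confirm the VM folder path used for 'folder'
govc ls /Datacenter/vm
# Confirm the resource pool name used for 'resourcepool-path'
govc ls /Datacenter/host/Cluster/Resources
# Confirm the datastore referenced by 'default-datastore'
govc datastore.info Shared-NFS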
Update master-config.yaml
Add the following to /etc/origin/master/master-config.yaml:
kubernetesMasterConfig:
  ...
  apiServerArguments:
    cloud-provider:
    - "vsphere"
    cloud-config:
    - "/etc/origin/cloudprovider/vsphere.conf"
  controllerArguments:
    cloud-provider:
    - "vsphere"
    cloud-config:
    - "/etc/origin/cloudprovider/vsphere.conf"
Update node-config.yaml
Add the following to /etc/origin/node/node-config.yaml:
kubeletArguments:
  cloud-provider:
  - "vsphere"
  cloud-config:
  - "/etc/origin/cloudprovider/vsphere.conf"
Restart services
From the master, restart the services:
master-restart api
master-restart controllers
systemctl restart atomic-openshift-node
On all the nodes, restart the node service:
systemctl restart atomic-openshift-node
Remove node to add providerID
The following steps delete the nodes and restart the node services. The observation from this step is that .spec.providerID is added to the node YAML after the node is deleted and its node service restarted. Use the validation step below before and after deleting a node to review the node YAML details via `oc get node <name of node> -o json`.
- Check and back up the existing node labels (they will need to be re-applied after the delete; see the example after this list):
oc describe node <node_name> | grep -Poz '(?s)Labels.*\n.*(?=Taints)'
- Delete the node:
oc delete node <node_name>
- Restart the node service on all nodes:
systemctl restart atomic-openshift-node
- Steps to validate:
# To make sure all nodes are Ready
oc get nodes
# To check that .spec.providerID has been added for each node
oc get nodes -o json
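Deleting a node removes any custom labels, so re-apply them from the backup taken in the first step. A sketch, using hypothetical example labels; substitute the key/value pairs you captured:

# Hypothetical labels for illustration; use the ones from your backup
oc label node <node_name> region=primary zone=default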
Create vSphere storage-class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: "vsphere-standard"
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: zeroedthick
  datastore: "Shared-NFS"
reclaimPolicy: Delete
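Save the definition to a file and create the storage class; the file name here is just an assumption:

oc create -f vsphere-storage-class.yaml
# Confirm it is registered
oc get storageclass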
Let's test it out
Create a PVC that uses the vsphere-standard storage class:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: vsphere-test-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: vsphere-standard
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Result
NAME                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
vsphere-test-storage   Bound    pvc-b70a6916-2d21-11e9-affb-005056a0e841   1Gi        RWO            vsphere-standard   22h
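To exercise the volume end to end, a minimal pod can mount the claim. This is a sketch; the pod name, image, and mount path are illustrative and not part of the original setup:

kind: Pod
apiVersion: v1
metadata:
  name: vsphere-test-pod
spec:
  containers:
  - name: test
    image: busybox
    command: ["sh", "-c", "echo hello > /data/test.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: vsphere-test-storage

If the pod reaches Running and the write succeeds, the vSphere volume was provisioned and attached correctly.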
Troubleshooting
If it still does not work, here are some suggestions to debug:
- Check that your resource pool is correct
The following command returns the list of VMs that belong to the "OPENSHIFT" resource pool. It should list all the hosts in your OpenShift cluster.
govc pool.info -json /Datacenter/host/Cluster/Resources/OPENSHIFT | jq -r '.ResourcePools[].Vm[] | join(":")' | xargs govc ls -L
- Check that the node YAML has externalID and providerID set after deleting the node and restarting atomic-openshift-node.
kubectl get nodes -o json | jq '.items[]|[.metadata.name, .spec.externalID, .spec.providerID]'
If you don't have jq, you can download it from https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64
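If installing jq is not an option, oc's jsonpath output gives a similar view (a sketch):

# Print each node name with its providerID
oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}'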