OpenShift v3.11 – Configure vSphere Cloud Provider

I would like to share my configuration for testing vSphere volume on OpenShift here. I hope you will have a better experience with it after reading this blog.

vSphere Configuration

  1. Create a folder “RHEL” for all the VMs
  2. Create OPENSHIFT as the resource pool and assign all VMs under the same resource pool
  3. The name of the virtual machine must match the name of the nodes for the OpenShift cluster. For example, the name of the host is used in vCenter must match the name of the node that is used in the ansible inventory host for installation. For example, the name of the VM is pocnode.sc.ocpdemo.online and I used the same FQDN in the OpenShift inventory file.
  4. To prepare the environment for vSphere cloud provider, steps are shown below.
  • Set up the GOVC environment:
curl -LO https://github.com/vmware/govmomi/releases/download/v0.15.0/govc_linux_amd64.gz
gunzip govc_linux_amd64.gz
chmod +x govc_linux_amd64
cp govc_linux_amd64 /usr/bin/govc
export GOVC_URL='vCenter IP OR FQDN'
export GOVC_USERNAME='vCenter User'
export GOVC_PASSWORD='vCenter Password'
export GOVC_INSECURE=1
  • Find the host VM path:
govc ls /datacenter/vm/<vm-folder-name>
  • Set disk.EnableUUID to true for all VMs:
govc vm.change -e="disk.enableUUID=1" -vm='VM Path/vm-name'

OpenShift Configuration

vSphere.conf

To configure OpenShift to use vSphere volume, it is required to configure the vSphere cloud provider. To configure the cloud provider, you will need to create a file /etc/origin/cloudprovider/vsphere.conf as shown below.

[Global] 
        user = "vcenter username" 
        password = "vcenter password" 
        port = "443" 
        insecure-flag = "1" 
        datacenters = "Datacenter" 
[VirtualCenter "1.2.3.4"] 

[Workspace] 
        server = "1.2.3.4" 
        datacenter = "Datacenter"
        folder = "/Datacenter/vm/RHEL" 
        default-datastore = "Shared-NFS" 
        resourcepool-path = "OPENSHIFT" 

[Disk]
        scsicontrollertype = pvscsi 
[Network]
        public-network = "VM Network 2"

Observation:

  • use govc ls to figure out the value for ‘folder`
  • use just the name of the resource pool, not the entire path

Update master-config.yaml

Add the following to the /etc/origin/master/master-config.yaml

kubernetesMasterConfig:
  ...
  apiServerArguments:
    cloud-provider:
      - "vsphere"
    cloud-config:
      - "/etc/origin/cloudprovider/vsphere.conf"
  controllerArguments:
    cloud-provider:
      - "vsphere"
    cloud-config:
      - "/etc/origin/cloudprovider/vsphere.conf"

Update node-config.yaml

Add the following to the /etc/origin/node/node-config.yaml

kubeletArguments:
  cloud-provider:
    - "vsphere"
  cloud-config:
    - "/etc/origin/cloudprovider/vsphere.conf"

Restart services

From the master, restart services

master-restart api
master-restart controllers
systemctl restart atomic-openshift-node

From all the nodes, restart service

systemctl restart atomic-openshift-node

Remove node to add providerID

The following steps to delete nodes and restart node services. Observation from this step is the .spec.providerID was added to the node YAML file after delete and restart the node. Use the validation step below before and after deleting the node to review the node YAML details via `oc get node <name of node> -o json`.

  • Check and backup existing node labels:
oc describe node <node_name> | grep -Poz '(?s)Labels.*\n.*(?=Taints)'
  • Delete the nodes
oc delete node <node_name>
  • Restart all node services
systemctl restart atomic-openshift-node
  • Step to validate
#To make sure all nodes are Ready
oc get nodes
#To check .spec.providerID for each nodes is added 
oc get nodes -o json

Create vSphere storage-class

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: "vsphere-standard" 
provisioner: kubernetes.io/vsphere-volume 
parameters:
    diskformat: zeroedthick 
    datastore: "Shared-NFS" 
reclaimPolicy: Delete

Let test it out

Create a PVC that uses the vSphere-volume storage-class

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: vsphere-test-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: vsphere-standard
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Result

NAME                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
vsphere-test-storage   Bound     pvc-b70a6916-2d21-11e9-affb-005056a0e841   1Gi        RWO            vsphere-standard   22h

Troubleshooting

If it still does not work, here are some suggestions to debug:

  • Check your resource pool if it is correct

The following command returns the list of VMs that belongs to “OPENSHIFT” resource pool. It should list out all your host for your OpenShift cluster.

govc pool.info -json /Datacenter/host/Cluster/Resources/OPENSHIFT | jq -r '.ResourcePools[].Vm[] | join(":")' | xargs govc ls -L
  • Check if the node YAML has added externalID and providerID after deleting and restarting atomic-openshift-node.
kubectl get nodes -o json | jq '.items[]|[.metadata.name, .spec.externalID, .spec.providerID]'

If you don’t have jq, you can download from https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s