Installing OCP 3.9 on Azure

This is an update to my previous blog post about OpenShift 3.7 on Azure. OpenShift 3.9 is out, and I tested the latest version on Azure.

I discovered that a few things are different from version 3.7. I am going to share them in this post and hope it will help someone out there.

The environment uses unmanaged disks for my VMs and runs RHEL 7.5. The OpenShift version is 3.9.14.

Host Configuration

In this version, I still follow the same rule for configuring the nodes in the inventory file. Here is my latest example for the [nodes] section. It is important that openshift_hostname matches the Azure instance name shown in the Azure portal.

10.0.0.5 openshift_ip=10.0.0.5 openshift_hostname=ocpnode1 openshift_node_labels="{'region': 'primary', 'zone': 'west'}"
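
To double-check that the names line up, you can compare the short hostname on each VM with the instance names reported by the Azure CLI. Something like the following should do it (assuming the az CLI is installed and logged in; the resource group name here is a placeholder):

# On the OpenShift host: the short hostname that openshift_hostname should match
hostname -s

# From any machine with the Azure CLI: list the instance names in the resource group
az vm list --resource-group my-ocp-rg --query "[].name" --output table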

NetworkManager

I did not need to touch NetworkManager in this test since I am using the default Azure domain. If you are using custom DNS for your VMs, make sure NetworkManager is working correctly before the installation. See my previous post for more information on this.
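
For reference, here is the kind of quick sanity check I would run to confirm NetworkManager is active and handing out the expected DNS settings (the interface name eth0 is just an assumption; adjust it for your VM):

# Confirm NetworkManager is running and managing the interfaces
systemctl is-active NetworkManager
nmcli device status

# Check the DNS servers NetworkManager configured for the interface
nmcli device show eth0 | grep IP4.DNS
cat /etc/resolv.conf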

Ansible Inventory File Example

Here is my sample inventory file (/etc/ansible/hosts) for installing OCP 3.9.14. In this test, my goal is to test the Azure cloud provider plugin.

[OSEv3:children]
masters
nodes
etcd
nfs
[OSEv3:vars]
ansible_ssh_user=root
deployment_type=openshift-enterprise
openshift_clock_enabled=true
openshift_disable_check=disk_availability,memory_availability,docker_image_availability,docker_storage
openshift_template_service_broker_namespaces=['openshift']
openshift_enable_service_catalog=true
template_service_broker_install=true
ansible_service_broker_local_registry_whitelist=['.*-apb$']
openshift_master_default_subdomain=apps.poc.openshift.online
openshift_hosted_router_selector='region=infra'
openshift_hosted_registry_selector='region=infra'
openshift_install_examples=true
openshift_docker_insecure_registries=172.30.0.0/16
openshift_hosted_manage_router=true
openshift_hosted_manage_registry=true
openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_registry_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=20Gi
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/openshift/openshift-passwd'}]
[masters]
10.0.0.4
[nfs]
10.0.0.4
[etcd]
10.0.0.4
[nodes]
10.0.0.4 openshift_ip=10.0.0.4 openshift_hostname=ocpmaster openshift_public_ip=1.2.3.4 openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=true openshift_public_hostname=ocpmaster.poc.openshift.online
10.0.0.5 openshift_ip=10.0.0.5 openshift_hostname=ocpnode1 openshift_node_labels="{'region': 'primary', 'zone': 'west'}"
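
Before running any playbooks, it does not hurt to confirm that Ansible can reach every host in the inventory. A simple connectivity check against the groups above (assuming the inventory lives at /etc/ansible/hosts) would look like this:

# Verify SSH connectivity to every host in the OSEv3 group
ansible OSEv3 -i /etc/ansible/hosts -m ping

# Optionally confirm the short hostname Ansible sees on each node
ansible nodes -i /etc/ansible/hosts -m command -a "hostname -s"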

Install the Cluster

After creating the Ansible inventory file, execute the following commands to install OCP:

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
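
Once the playbooks finish, a quick sanity check I like to do is to confirm the nodes registered and the default router and registry pods came up. For example:

# All nodes should report Ready
oc get nodes

# The router and registry pods run in the default project
oc get pods -n default

# Confirm the client and server versions
oc version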

Service Principal

A service principal is required to configure Azure as the cloud provider. Here is the example command I used to create mine:

az ad sp create-for-rbac -n ocpsp --password XXXX --role contributor --scopes /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
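
The values needed later in /etc/azure/azure.conf can also be pulled from the CLI. These lookups are just one way to do it (ocpsp is the service principal name used above):

# Subscription and tenant IDs for azure.conf
az account show --query "{subscriptionID:id, tenantID:tenantId}" --output json

# Look up the appId (aadClientID) of the service principal created above
az ad sp list --display-name ocpsp --query "[].appId" --output tsv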

Create /etc/azure/azure.conf File

Out of all the steps, this is one that is different from the previous version. Here is the azure.conf that I used in my test:

tenantID: xx1x11xx-x1x1-1x11-x1x1-x1x1111111x1
subscriptionID: 11xx111x-1x1x-1xxx-1x11-x1x1x1x11xx1
aadClientID: 1111111x-1111-1xxx-1111-x1111x111xxx
aadClientSecret: 0000000
resourceGroup: name-of-resource-group
cloud: AzurePublicCloud
location: westus
vnetName: virtual-network-name
securityGroupName: network-security-group-name
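
This file will be referenced by both the master and the node configuration in the next step, so it needs to exist on every host. Something along these lines creates it with restrictive permissions and copies it out (the node name is the one from my inventory; adjust for yours):

# On the master: lock down the credentials file
mkdir -p /etc/azure
chmod 600 /etc/azure/azure.conf

# Copy it to each node so the kubelet can read the same settings
ssh ocpnode1 mkdir -p /etc/azure
scp /etc/azure/azure.conf ocpnode1:/etc/azure/azure.conf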

Update master-config.yaml and node-config.yaml

This configuration is the same as in 3.7. Here are the sample sections from my /etc/origin/master/master-config.yaml and /etc/origin/node/node-config.yaml. Only add the cloud provider-related entries to your existing YAML files.

Configuration of master-config.yaml:

kubernetesMasterConfig:
  apiServerArguments:
    cloud-config:
    - /etc/azure/azure.conf
    cloud-provider:
    - azure
    runtime-config:
    - apis/settings.k8s.io/v1alpha1=true
    storage-backend:
    - etcd3
    storage-media-type:
    - application/vnd.kubernetes.protobuf
  controllerArguments:
    cloud-provider:
    - azure
    cloud-config:
    - /etc/azure/azure.conf 

Configuration of node-config.yaml:

kubeletArguments: 
  cloud-config:
  - /etc/azure/azure.conf
  cloud-provider:
  - azure
  node-labels:
  - region=infra
  - zone=default

Back Up the Node Labels

Just to be on the safe side, back up the node labels before restarting the services. To gather the labels for each node, run oc get nodes --show-labels and save the output. If your nodes do not have all their labels after the services restart, you can restore them from the backup.
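
A simple way to capture that backup is to save the output to a dated file, for example:

# Save the current node labels before touching any services
oc get nodes --show-labels > /root/node-labels-$(date +%Y%m%d).txt

# Review the saved labels later if anything goes missing
cat /root/node-labels-*.txt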

Restart Services

The documentation indicates that the nodes need to be removed for the Azure configuration to take effect. In my test, I did NOT delete my nodes; the restart process removed and re-added the nodes to the cluster automatically. This is how, by observation, I got the cloud provider working in this version.

  • Restart all master services
systemctl restart atomic-openshift-master-api
systemctl restart atomic-openshift-master-controllers

To monitor events during the restart:

journalctl -u atomic-openshift-master-api -f
journalctl -u atomic-openshift-master-controllers -f
  • Restart the node service on all nodes
systemctl restart atomic-openshift-node

To monitor events during the restart:

journalctl -u atomic-openshift-node -f
oc get node -w

Note: The journalctl output shows the node being removed from the cluster list. Run oc get nodes to monitor the list of nodes in the cluster. The node will eventually be added back to the list if everything works correctly.

Update Roles and Labels

A role is defined for each node in an OpenShift 3.9 cluster. The output of oc get nodes -o wide --show-labels will look similar to this:

NAME        STATUS    ROLES     AGE       VERSION             EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION          CONTAINER-RUNTIME   LABELS

ocpmaster   Ready     master    11h       v1.9.1+a0ce1bc657   <none>        Employee SKU   3.10.0-862.el7.x86_64   docker://1.13.1     beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_E2s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=westus,failure-domain.beta.kubernetes.io/zone=0,kubernetes.io/hostname=ocpmaster,node-role.kubernetes.io/master=true,region=infra,zone=default

ocpnode1    Ready     compute   11h       v1.9.1+a0ce1bc657   <none>        Employee SKU   3.10.0-862.el7.x86_64   docker://1.13.1     beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_E2s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=westus,failure-domain.beta.kubernetes.io/zone=0,kubernetes.io/hostname=ocpnode1,node-role.kubernetes.io/compute=true,region=primary,zone=west

After restarting the master and node services, you will need to add the roles back to the nodes once their status shows Ready. Here are the commands to add the roles back:

oc label node ocpmaster node-role.kubernetes.io/master=true
oc label node ocpnode1 node-role.kubernetes.io/compute=true
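
If any of the custom labels from the inventory (region/zone) are also missing, they can be restored from the backup taken earlier. For example, using the values from my inventory:

# Verify the roles and labels are back in place
oc get nodes --show-labels

# Restore custom labels if needed (--overwrite is harmless if they already exist)
oc label --overwrite node ocpnode1 region=primary zone=west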

Setting up Azure Disk

In this test, I used unmanaged disks with my VMs. You will need to know what type of disk you have before creating the StorageClass for the cluster.

Here is how to configure a StorageClass for unmanaged disks.

  • Create a storageclass.yaml file with the following information
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-storageclass
provisioner: kubernetes.io/azure-disk
parameters:
  storageAccount: pocadmin
  • Run the following command with the file created in the previous step
oc create -f storageclass.yaml
  • Make this StorageClass the default
oc annotate storageclass azure-storageclass storageclass.beta.kubernetes.io/is-default-class="true"

With the steps above, you can now use Azure disks through the default StorageClass.
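
To confirm that dynamic provisioning actually works against the storage account, a small test claim can be created and watched until it binds. The claim name and size here are arbitrary:

# Create a test claim against the default StorageClass
cat <<EOF | oc create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-disk-test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

# The claim should move from Pending to Bound once the disk is provisioned
oc get pvc azure-disk-test -w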