The goal of this blog is to share my experience installing OCP 3.7 on Azure. I hope the information here saves you some time and helps you achieve a successful installation. The exercise was to install OCP on Azure and enable dynamic provisioning as well. Below is a list of items you may want to pay extra attention to when planning your installation.
Use Azure instance name as the value of openshift_hostname in your inventory file
Assuming you are using the OpenShift advanced installation on Azure, you will need to set openshift_hostname in the inventory file to match the instance name shown in your Azure portal. The Azure instance name is the name of your virtual machine.
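For example, for a VM named ocp-master-1 in the Azure portal, the inventory entry might look like the snippet below (the host name and IP are hypothetical placeholders; adjust to your environment):

```ini
# Inventory fragment: openshift_hostname must match the Azure VM name
[masters]
ocp-master-1 openshift_hostname=ocp-master-1 openshift_ip=10.0.0.4
```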
When creating virtual machines in your resource group, the setup depends on whether you are using the Azure-provided domain or a custom domain of your own. If you are not using the Azure domain, you can set the domain on your VM via the following command:
hostnamectl set-hostname <FQDN of the host>
Once the Ansible inventory file is created, execute the following steps to make sure NetworkManager is working properly.
Configure resolv.conf on all hosts
export DOMAIN=`domainname -d`
ansible all -m shell -a "systemctl restart NetworkManager"
ansible all -m shell -a "nmcli con modify eth0 ipv4.dns-search $DOMAIN"
ansible all -m shell -a "systemctl restart NetworkManager"
Create service principal
This is a required step to set up Azure dynamic provisioned storage. There are a few options for creating a service principal; an example is shown below. Please see https://github.com/microsoft/openshift-container-platform for more details.
az ad sp create-for-rbac -n <friendly name> --password <password> --role contributor --scopes /subscriptions/<subscription_id>
Location in azure.conf is not optional for managed and unmanaged disks
If you are using the managed disk option, make sure location is also set in your azure.conf.
Here is an example of /etc/azure/azure.conf:
tenantId: xxxxxxxxxx
subscriptionId: xxxxxxxxxxxxxx
aadClientId: xxxxxxxxxx
aadClientSecret: xxxxx
aadTenantId: xxxxxxxxxx
resourceGroup: xxxx
cloud: AzurePublicCloud
location: westus
No need to remove the node
The node registration is automated. Once the installation completes, add /etc/azure/azure.conf to all nodes.
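One way to distribute the file to every node is a small Ansible playbook like the sketch below (it assumes your inventory groups the hosts under a nodes group and that azure.conf already exists on the control host; adjust as needed):

```yaml
# distribute-azure-conf.yml -- copy the cloud provider config to all nodes
- hosts: nodes
  become: yes
  tasks:
  - name: Ensure /etc/azure exists
    file:
      path: /etc/azure
      state: directory
  - name: Copy azure.conf to the node
    copy:
      src: /etc/azure/azure.conf
      dest: /etc/azure/azure.conf
      mode: '0644'
```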
Update the master-config.yml with the following information:
kubernetesMasterConfig:
  apiServerArguments:
    cloud-config:
    - /etc/azure/azure.conf
    cloud-provider:
    - azure
    runtime-config:
    - apis/settings.k8s.io/v1alpha1=true
    storage-backend:
    - etcd3
    storage-media-type:
    - application/vnd.kubernetes.protobuf
  controllerArguments:
    cloud-provider:
    - azure
    cloud-config:
    - /etc/azure/azure.conf
Restart atomic-openshift-master-api and atomic-openshift-master-controllers on the master via the following commands:
systemctl restart atomic-openshift-master-api
systemctl restart atomic-openshift-master-controllers
Update all node-config.yaml files as below:
kubeletArguments:
  cloud-config:
  - /etc/azure/azure.conf
  cloud-provider:
  - azure
Restart atomic-openshift-node on the nodes via the following command:
systemctl restart atomic-openshift-node
While the atomic-openshift-node service is starting up, you can watch the log by running the following on the node:
journalctl -u atomic-openshift-node -f
You should see the node get removed and then re-registered with the cloud provider; there is no need to delete the node manually. If you run the command below while restarting the atomic-openshift-node service, you will see the node get removed and then added back to the cluster after some time.
oc get nodes -w
Creating the correct storage type for your environment
You can validate your disk option before creating the storageclass. If the storageclass type does not match your disk type, the pod will throw an error when it starts to mount the volume.
At VM creation time, both managed and unmanaged disk options are available. I learned that the location string is needed in /etc/azure/azure.conf for the managed disk option. Note that RWX does not appear to be a supported access mode for the Azure disk volume plug-in. Below are the storageclasses I tried for unmanaged disk (top) and managed disk (bottom).
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-storageclass
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/azure-disk
parameters:
  storageAccount: xxxxxxxx
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: slow2
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Managed
  location: westus
To set the storageclass as the default:
oc annotate storageclass azure-storageclass storageclass.beta.kubernetes.io/is-default-class="true"
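To verify that dynamic provisioning works end to end, you can create a small test claim against the default storageclass. This is a sketch; the claim name and size are hypothetical, and the access mode is ReadWriteOnce since RWX is not available with the Azure disk plug-in:

```yaml
# test-pvc.yaml -- request a 1Gi volume from the default storageclass
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: azure-test-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Create it with oc create -f test-pvc.yaml, then watch oc get pvc until the claim reaches the Bound state, which confirms the provisioner created an Azure disk for it.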
Now you should be able to install an OpenShift cluster on Azure with Azure storage as the default storageclass.