OpenShift v3.11 – Configure vSphere Cloud Provider

I would like to share my configuration for testing vSphere volume on OpenShift here. I hope you will have a better experience with it after reading this blog.

vSphere Configuration

  1. Create a folder “RHEL” for all the VMs
  2. Create OPENSHIFT as the resource pool and assign all VMs under the same resource pool
  3. The name of each virtual machine must match the name of the corresponding OpenShift node. In other words, the hostname used in vCenter must match the node name used in the ansible inventory file for the installation. For example, my VM is named pocnode.sc.ocpdemo.online and I used the same FQDN in the OpenShift inventory file.
  4. To prepare the environment for the vSphere cloud provider, follow the steps below.
  • Set up the GOVC environment:
curl -LO https://github.com/vmware/govmomi/releases/download/v0.15.0/govc_linux_amd64.gz
gunzip govc_linux_amd64.gz
chmod +x govc_linux_amd64
cp govc_linux_amd64 /usr/bin/govc
export GOVC_URL='vCenter IP OR FQDN'
export GOVC_USERNAME='vCenter User'
export GOVC_PASSWORD='vCenter Password'
export GOVC_INSECURE=1
  • Find the host VM path:
govc ls /datacenter/vm/<vm-folder-name>
  • Set disk.EnableUUID to true for all VMs:
govc vm.change -e="disk.enableUUID=1" -vm='VM Path/vm-name'
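If you have more than a few VMs, a small loop can apply the setting to every VM in the folder. This is just a sketch; the /Datacenter/vm/RHEL path is based on the folder name used above, so adjust it to match your environment:

for vm in $(govc ls /Datacenter/vm/RHEL); do
  govc vm.change -e="disk.enableUUID=1" -vm="$vm"
done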

OpenShift Configuration

vSphere.conf

To configure OpenShift to use vSphere volumes, you must configure the vSphere cloud provider. To do so, create the file /etc/origin/cloudprovider/vsphere.conf as shown below.

[Global] 
        user = "vcenter username" 
        password = "vcenter password" 
        port = "443" 
        insecure-flag = "1" 
        datacenters = "Datacenter" 
[VirtualCenter "1.2.3.4"] 

[Workspace] 
        server = "1.2.3.4" 
        datacenter = "Datacenter"
        folder = "/Datacenter/vm/RHEL" 
        default-datastore = "Shared-NFS" 
        resourcepool-path = "OPENSHIFT" 

[Disk]
        scsicontrollertype = pvscsi 
[Network]
        public-network = "VM Network 2"

Observations:

  • use govc ls to figure out the value for ‘folder’ (see the example below)
  • use just the name of the resource pool, not the entire path
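For example, the following commands helped me confirm both values before filling in vsphere.conf. The paths are assumptions based on the names used above, so substitute your own datacenter and cluster names:

govc ls /Datacenter/vm
govc pool.info /Datacenter/host/Cluster/Resources/OPENSHIFT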

Update master-config.yaml

Add the following to /etc/origin/master/master-config.yaml:

kubernetesMasterConfig:
  ...
  apiServerArguments:
    cloud-provider:
      - "vsphere"
    cloud-config:
      - "/etc/origin/cloudprovider/vsphere.conf"
  controllerArguments:
    cloud-provider:
      - "vsphere"
    cloud-config:
      - "/etc/origin/cloudprovider/vsphere.conf"

Update node-config.yaml

Add the following to /etc/origin/node/node-config.yaml:

kubeletArguments:
  cloud-provider:
    - "vsphere"
  cloud-config:
    - "/etc/origin/cloudprovider/vsphere.conf"

Restart services

From the master, restart services

master-restart api
master-restart controllers
systemctl restart atomic-openshift-node

From all the nodes, restart the node service

systemctl restart atomic-openshift-node

Remove node to add providerID

The following steps delete the nodes and restart the node services. The observation from this step is that .spec.providerID is added to the node YAML after the node is deleted and the node service restarted. Use the validation step below before and after deleting the node to review the node details via `oc get node <name of node> -o json`.

  • Check and backup existing node labels:
oc describe node <node_name> | grep -Poz '(?s)Labels.*\n.*(?=Taints)'
  • Delete the nodes
oc delete node <node_name>
  • Restart all node services
systemctl restart atomic-openshift-node
  • Step to validate
#To make sure all nodes are Ready
oc get nodes
#To check that .spec.providerID is added for each node
oc get nodes -o json

Create vSphere storage-class

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: "vsphere-standard" 
provisioner: kubernetes.io/vsphere-volume 
parameters:
    diskformat: zeroedthick 
    datastore: "Shared-NFS" 
reclaimPolicy: Delete
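Save the definition above to a file and create the storage class. The file name below is just an example:

oc create -f vsphere-storageclass.yaml
oc get storageclass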

Let's test it out

Create a PVC that uses the vsphere-standard storage class

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: vsphere-test-storage
  annotations:
    volume.beta.kubernetes.io/storage-class: vsphere-standard
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
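Save the claim to a file and create it, then check that it binds. Again, the file name is arbitrary:

oc create -f vsphere-test-pvc.yaml
oc get pvc vsphere-test-storage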

Result

NAME                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
vsphere-test-storage   Bound     pvc-b70a6916-2d21-11e9-affb-005056a0e841   1Gi        RWO            vsphere-standard   22h
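To confirm the provisioned volume can actually be attached and mounted, a throwaway pod like the sketch below can be used. The pod name and image are just examples:

oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: vsphere-test-pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: test-volume
      mountPath: /data
  volumes:
  - name: test-volume
    persistentVolumeClaim:
      claimName: vsphere-test-storage
EOF

Once the pod reaches Running, the vSphere volume was attached and mounted successfully; clean up with oc delete pod vsphere-test-pod.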

Troubleshooting

If it still does not work, here are some suggestions to debug:

  • Check that your resource pool is correct

The following command returns the list of VMs that belong to the “OPENSHIFT” resource pool. It should list all the hosts for your OpenShift cluster.

govc pool.info -json /Datacenter/host/Cluster/Resources/OPENSHIFT | jq -r '.ResourcePools[].Vm[] | join(":")' | xargs govc ls -L
  • Check if the node YAML has added externalID and providerID after deleting and restarting atomic-openshift-node.
kubectl get nodes -o json | jq '.items[]|[.metadata.name, .spec.externalID, .spec.providerID]'

If you don’t have jq, you can download it from https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64

Configure and Troubleshoot LDAP on OpenShift

One of the most frequently asked questions I get is how to configure LDAP on OpenShift. Instead of replying with my PDF every time I get a request about this, it may be good to share the information here so I can always refer back to it. Configuring LDAP on OpenShift is pretty straightforward if you have all the correct connection information. In this blog I will walk you through the configuration for both pre- and post-installation options. I will also provide some troubleshooting steps on how to debug if you run into issues.

Problem: Can’t login with LDAP users.

AD usually uses sAMAccountName for login, while LDAP usually uses uid.

Step 1

Use the following ldapsearch to validate the information that was given by the customer:

ldapsearch -x -D "CN=xxx,OU=Service-Accounts,OU=DCS,DC=homeoffice,DC=example,DC=com" \
  -W -H ldaps://ldaphost.example.com -b "ou=Users,dc=office,dc=example,DC=com" \
  -s sub 'sAMAccountName=user1'

If the ldapsearch does not return any user, the -D or -b value may not be correct; retry with a different baseDN. If too many entries are returned, add a filter to your search.

filter example: (objectclass=person)
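For example, combining the filter with the search from Step 1 (same placeholder bind DN and baseDN as above):

ldapsearch -x -D "CN=xxx,OU=Service-Accounts,OU=DCS,DC=homeoffice,DC=example,DC=com" \
  -W -H ldaps://ldaphost.example.com -b "ou=Users,dc=office,dc=example,DC=com" \
  -s sub '(&(objectclass=person)(sAMAccountName=user1))'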

Step 2

Logging: set OPTIONS=--loglevel=5 in /etc/sysconfig/atomic-openshift-master

Step 3

The customer had the htpasswd provider set up before switching to Active Directory, and user identities had already been created for the same users. In journalctl -u atomic-openshift-master, a conflict with the existing user identity was logged when the user tried to log in.

Here were the steps to clean up the conflicting identities:
oc get identity
oc delete identity <name_of_identity_for_user1>
oc get user
oc delete user user1
The final configuration in master-config.yaml is shown below.

oauthConfig:
  assetPublicURL: https://master.example.com:8443/console/
  grantConfig:
    method: auto
  identityProviders:
  - name: "OfficeAD"
    challenge: true
    login: true
    provider:
      apiVersion: v1
      kind: LDAPPasswordIdentityProvider
      attributes:
        id:
        - dn
        email:
        - mail
        name:
        - cn
        preferredUsername:
        - sAMAccountName
      bindDN: "CN=LinuxSVC,OU=Service-Accounts,OU=DCS,DC=office,DC=example,DC=com"
      bindPassword: "password"
      ca: ad-ca.pem.crt
      insecure: false
      url: "ldaps://ad-server.example.com:636/CN=Users,DC=hoffice,DC=example,DC=com?sAMAccountName?sub"

Installing OCP 3.9 on Azure

This is an update from my previous blog about OpenShift 3.7 on Azure. OpenShift 3.9 is out and I tested the latest version on Azure.

I discovered that there are a few things that are different from version 3.7. I am going to share them in this blog and hope it will help someone out there.

The environment is using unmanaged disk for my VMs, and it is running RHEL 7.5. The version of OpenShift is 3.9.14.

Host Configuration

In this version, I still use the same rule for configuring the nodes in the inventory file. Here is my latest example for the [nodes] section, shown below. It is important that openshift_hostname is the same as the Azure instance name shown in the Azure portal.

10.0.0.5 openshift_ip=10.0.0.5 openshift_hostname=ocpnode1 openshift_node_labels="{'region': 'primary', 'zone': 'west'}"

NetworkManager

I did not need to touch NetworkManager in this test since I am using the default Azure domain here. If you are using custom DNS for your VMs, I would still make sure NetworkManager is working correctly before the installation. See my previous post for more information on this.

Ansible Inventory file example

Here is my sample inventory file (/etc/ansible/hosts) for installing OCP 3.9.14. In this test, my goal is to test cloud provider plugin on Azure.

[OSEv3:children]
masters
nodes
etcd
nfs
[OSEv3:vars]
ansible_ssh_user=root
deployment_type=openshift-enterprise
openshift_clock_enabled=true
openshift_disable_check=disk_availability,memory_availability,docker_image_availability,docker_storage
openshift_template_service_broker_namespaces=['openshift']
openshift_enable_service_catalog=true
template_service_broker_install=true
ansible_service_broker_local_registry_whitelist=['.*-apb$']
openshift_master_default_subdomain=apps.poc.openshift.online
openshift_hosted_router_selector='region=infra'
openshift_hosted_registry_selector='region=infra'
openshift_install_examples=true
openshift_docker_insecure_registries=172.30.0.0/16
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_manage_router=true
openshift_hosted_manage_registry=true
openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_registry_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=20Gi
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/openshift/openshift-passwd'}]
[masters]
10.0.0.4
[nfs]
10.0.0.4
[etcd]
10.0.0.4
[nodes]
10.0.0.4 openshift_ip=10.0.0.4 openshift_hostname=ocpmaster openshift_public_ip=1.2.3.4 openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_scheduleable=true openshift_public_hostname=ocpmaster.poc.openshift.online
10.0.0.5 openshift_ip=10.0.0.5 openshift_hostname=ocpnode1 openshift_node_labels="{'region': 'primary', 'zone': 'west'}"

Install the Cluster

After creating the ansible inventory file, execute the following commands to get OCP installed.

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml

Service Principal

This is required for configuring Azure as the cloud provider. Here is the example command I used to create my service principal.

az ad sp create-for-rbac -n ocpsp --password XXXX --role contributor --scopes /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Create /etc/azure/azure.conf File

Out of all the steps, this is one of the steps that is different from the last version. Here is the azure.conf that I used in my test.

tenantID: xx1x11xx-x1x1-1x11-x1x1-x1x1111111x1
subscriptionID: 11xx111x-1x1x-1xxx-1x11-x1x1x1x11xx1
aadClientID: 1111111x-1111-1xxx-1111-x1111x111xxx
aadClientSecret: 0000000
resourceGroup: name-of-resource-group
cloud: AzurePublicCloud
location: westus
vnetName: virtual-network-name
securityGroupName: network-security-group-name
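If you are not sure where to pull these values from, the Azure CLI can print most of them; aadClientID and aadClientSecret come from the service principal output above. These commands are a sketch and assume the az CLI is already logged in:

az account show --query tenantId -o tsv        # tenantID
az account show --query id -o tsv              # subscriptionID
az group show -n name-of-resource-group --query location -o tsv   # location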

Update master-config.yaml and node-config.yaml

This configuration is the same as 3.7. Here is the sample configuration that I have in my /etc/origin/master/master-config.yaml and /etc/origin/node/node-config.yaml. Only add the cloud-provider related portion to your existing yaml files.

Configuration of master-config.yaml:

kubernetesMasterConfig:
  apiServerArguments:
    cloud-config:
    - /etc/azure/azure.conf
    cloud-provider:
    - azure
    runtime-config:
    - apis/settings.k8s.io/v1alpha1=true
    storage-backend:
    - etcd3
    storage-media-type:
    - application/vnd.kubernetes.protobuf
  controllerArguments:
    cloud-provider:
    - azure
    cloud-config:
    - /etc/azure/azure.conf 

Configuration of node-config.yaml:

kubeletArguments: 
  cloud-config:
  - /etc/azure/azure.conf
  cloud-provider:
  - azure
  node-labels:
  - region=infra
  - zone=default

Backup the Node labels

Just to be on the safe side, we should back up the node labels before restarting the services. To gather the labels for each node, run oc get nodes --show-labels and save the output, as shown below. In case your nodes do not have all the labels after restarting the services, you can restore them from the backup.
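A minimal way to do that from the master (the file path is arbitrary):

oc get nodes --show-labels > /root/node-labels-backup.txt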

Restart Services

The documentation indicates that the node must be removed for the Azure configuration to work. In my test, I DID NOT delete my nodes; the process of restarting the services removed and added the nodes back to the cluster automatically. This is, by observation, how I got the cloud provider to work in this version.

  • Restart all master services
systemctl restart atomic-openshift-master-api
systemctl restart atomic-openshift-master-controllers

To monitor the events during the restart

journalctl -u atomic-openshift-master-api -f
journalctl -u atomic-openshift-master-controllers -f
  • Restart the node service on all nodes
systemctl restart atomic-openshift-node

To monitor the events during the restart

journalctl -u atomic-openshift-node -f
oc get node -w

Notes: journalctl shows that the node gets removed from the cluster list. Run oc get nodes to monitor the list of nodes in the cluster; the node will eventually be added back to the list if everything works correctly.

Update Roles and Labels

A role is defined for each node in an OpenShift 3.9 cluster. The output of oc get nodes -o wide --show-labels will look similar to the following:

NAME        STATUS    ROLES     AGE       VERSION             EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION          CONTAINER-RUNTIME   LABELS

ocpmaster   Ready     master    11h       v1.9.1+a0ce1bc657   <none>        Employee SKU   3.10.0-862.el7.x86_64   docker://1.13.1     beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_E2s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=westus,failure-domain.beta.kubernetes.io/zone=0,kubernetes.io/hostname=ocpmaster,node-role.kubernetes.io/master=true,region=infra,zone=default

ocpnode1    Ready     compute   11h       v1.9.1+a0ce1bc657   <none>        Employee SKU   3.10.0-862.el7.x86_64   docker://1.13.1     beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=Standard_E2s_v3,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=westus,failure-domain.beta.kubernetes.io/zone=0,kubernetes.io/hostname=ocpnode1,node-role.kubernetes.io/compute=true,region=primary,zone=west

After restarting the master and node services, you will need to add the roles back to the nodes once their status shows Ready. Here are the commands to add the roles back.

oc label node ocpmaster node-role.kubernetes.io/master=true
oc label node ocpnode1 node-role.kubernetes.io/compute=true

Setting up Azure Disk

In this test, I used unmanaged disks with my VMs. You will need to know what type of disk you have before creating the storageclass for the cluster.

Here is how to configure the storageclass for unmanaged disk.

  • Create a storageclass.yaml file with the following information
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-storageclass
provisioner: kubernetes.io/azure-disk
parameters:
  storageAccount: pocadmin
  • Run the following command with the file that was created in the previous step
oc create -f storageclass.yaml
  • Make this storageclass the default
oc annotate storageclass azure-storageclass storageclass.beta.kubernetes.io/is-default-class="true"
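To confirm the annotation took effect, oc get storageclass should now show (default) next to azure-storageclass:

oc get storageclass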

 

Now, you will be able to use Azure disk as your default storageclass using the steps above.

Installing OpenShift behind proxy

I have been wanting to write this blog to summarize the challenges that I had with proxies when installing OpenShift. The truth is that I don’t have a complete list of how to solve every problem in a proxy environment, but I will try to list what I did in the past to help you avoid or debug proxy-related issues as much as I can.

Environment Variable

  • Whitelist the hosts that the platform will be accessing. For example:

If you are using subscription manager on RHEL, you will need to whitelist the following hosts.

For RHSM/RHN (rpms):

For RH’s docker registry:

Example of other access that you may need:

    • index.docker.io
    • github.com
    • maven.org (Maven Central)
    • docker.io (dockerhub connection)
    • npmjs.org (node js build)
  • Setup /etc/profile.d/proxy.sh on all the nodes for your platform
#cat /etc/profile.d/proxy.sh
export http_proxy=http://host.name:port/
export https_proxy=http://host.name:port/
export no_proxy=.example.com,.svc

Note: “.svc” is needed if you want to install the Service Catalog

Add Proxy Information into Ansible Hosts File

Here is the list of parameters for a proxy environment

openshift_http_proxy=http://IPADDR:PORT
openshift_https_proxy=https://IPADDR:PORT
openshift_no_proxy='.example.com,some-internal-hosts.com'

Update Docker Configuration with Proxy Information

After the installation, the internal docker registry service IP will need to be added to the NO_PROXY parameter in the docker configuration in /etc/sysconfig/docker.

Getting the service IP of docker-registry on OpenShift

oc get svc docker-registry -n default

Append the service IP to the NO_PROXY list in the file.
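As a hedged example, the relevant lines in /etc/sysconfig/docker would end up looking something like this (the proxy host and registry service IP are placeholders):

HTTP_PROXY='http://host.name:port/'
HTTPS_PROXY='http://host.name:port/'
NO_PROXY='172.30.X.X,.example.com,.svc'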

Testing Build and Pushing Images to Registry

After the installation, it is always good to test out the build and make sure image can be pushed into the internal registry.

Here is my checklist if the build fails or an image cannot be pushed.

  • Check if the hosts that you are trying to access are on the whitelist for your proxy
  • Check if gitNoProxy is configured correctly under the BuildDefaults plug-in in /etc/origin/master/master-config.yaml. For example, if you are accessing an internal git repo, please make sure the repository server is on the gitNoProxy list (see the example configuration below).
  • In 3.7, you will also need to add the kubernetes service IP to the NO_PROXY environment variable and redeploy the docker-registry. Otherwise, you will get an error when trying to push images to the internal docker registry. See this link for more details: https://bugzilla.redhat.com/show_bug.cgi?id=1511870.

Note: to get the service IP for kubernetes: oc get svc kubernetes -n default
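For reference, the gitNoProxy setting mentioned above lives under the BuildDefaults plug-in in master-config.yaml. The snippet below is only a sketch with placeholder proxy hosts and git server; adjust the values for your environment:

admissionConfig:
  pluginConfig:
    BuildDefaults:
      configuration:
        apiVersion: v1
        kind: BuildDefaultsConfig
        gitHTTPProxy: http://host.name:port/
        gitHTTPSProxy: http://host.name:port/
        gitNoProxy: internal-git.example.com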

Hopefully, the checklist will help you to avoid any proxy related issue during installation.

Install OpenShift on Atomic Host on AWS

This blog is to share my experience installing OpenShift 3.7 on Atomic Host. Since I am using OpenShift Container Platform (supported by Red Hat), there are two options for installation: the RPM install, which runs on Red Hat Enterprise Linux (RHEL), and the containerized install, which uses Atomic Host (a container OS). In version 3.7, the installation can be done via a container on the Atomic Host, or from a Linux bastion host.

My test used a Linux host to install OpenShift on Atomic Host on AWS, which is one of the ways to get an Atomic instance provisioned. Atomic hosts are provisioned via a private AMI image for your cloud provider account; the AMI image was ami-e9494989 for my test. You need to register to get access for importing the private AMI here: https://www.redhat.com/en/technologies/cloud-computing/cloud-access.

There are many ways to automate the installation steps, but that is not what this blog is about. I wanted to find out how easy or hard it is to install OpenShift on a container OS, so I ran all the installation steps manually.

Setting Up on a Cloud Provider

There are a few things we need to set up on AWS. A wildcard DNS entry and a public master hostname are required prior to the installation. I used Route53 to add the A records for both of these requirements.

Per my test, I also had to add a tag to all Atomic instances with the key KubernetesCluster; the value of the key can be anything. The value of the KubernetesCluster key will be used for the openshift_clusterid parameter in the ansible inventory file. Without this tag on the Atomic instances, I was not able to register the OpenShift nodes with the cloud provider.

Setting Up Bastion host

The bastion host is a Linux (RHEL) host used to run the automation scripts that prepare and install OpenShift on all the Atomic hosts. This is one of the options for installing on Atomic Host. I like this option because I can reuse the same bastion host to install more than one cluster.

The steps to prepare the bastion host are straightforward.

subscription-manager register
subscription-manager attach --pool=
subscription-manager repos --disable="*"
subscription-manager repos \
    --enable="rhel-7-server-rpms" \
    --enable="rhel-7-server-extras-rpms" \
    --enable="rhel-7-server-ose-3.7-rpms" \
    --enable="rhel-7-fast-datapath-rpms"
yum install atomic-openshift-utils -y

Preparation before installation

Preparation steps are available at https://docs.openshift.com/container-platform/latest/install_config/install/host_preparation.html.

1. Generate SSH key on Bastion host via 'ssh-keygen' as root

2. Distribute the SSH key to all hosts (master and node) using the following command from the bastion host:
   ssh-copy-id -i ~/.ssh/id_rsa.pub 

3. Create a hosts.prepocp file which includes all the hostnames for the cluster. 
   Example is shown below.
   [nodes]
   ip-172-31-7-15.us-west-2.compute.internal
   ip-172-31-5-243.us-west-2.compute.internal

4. Create an ansible playbook (openshiftprep.yml) to automate the host preparation. 
   An example is here: https://github.com/piggyvenus/examples/blob/master/installAnsibleSample/v3.7/atomic/openshiftprep.yml

5. Execute the ansible playbook, which will register and update the Atomic hosts and configure docker on the added device (/dev/xvdb)
   ansible-playbook -i hosts.prepocp openshiftprep.yml
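As a quick sanity check from the bastion host, ansible connectivity to all hosts can be verified with the same hosts.prepocp inventory before and after the playbook run (a sketch):

ansible -i hosts.prepocp nodes -m ping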

Create Ansible Hosts file for OpenShift Advance Installation

Since we will need to create an inventory file (often referred to as the ansible hosts file) for the OpenShift installation, here is an example for the OpenShift advanced installation on Atomic Host on AWS: https://raw.githubusercontent.com/piggyvenus/examples/master/installAnsibleSample/v3.7/atomic/hosts.atomic.template

Download this file and update the corresponding information for installation. Then, save this file as /etc/ansible/hosts.

There are a few lessons learned here when creating the ansible hosts file. I added the following parameters to get a successful installation:

openshift_release=v3.7.23
openshift_image_tag=v3.7.23
openshift_pkg_version=-3.7.23
openshift_clusterid=<value of key KubernetesCluster from AWS Atomic instance>

Setting up Atomic Host For OCP Installation

I learned that there are a few extra steps I had to add to prepare the Atomic hosts before the OpenShift installation. These are the steps that I used in my test; I am not an Atomic expert, but they do what I wanted them to do.

Besides adding a disk for docker, I also needed extra disk space for the root partition. The out-of-the-box Atomic instance from AWS has only a 3GB root partition, which is not enough for the OpenShift installer. I had to do the following to get my docker and root partition configured the way I wanted. My goal was to extend my root partition to have extra disk space and configure docker using the additional volume that I attached to the instance.

The preparation script already configured docker to use /dev/xvdb. After running the previous ansible playbook, the following steps were used to extend my root partition.

ansible all -m shell -a "lvextend -L+50G /dev/mapper/atomicos-root"
ansible all -m shell -a "xfs_growfs /"
ansible all -m shell -a "df -h"

Next is to reboot all hosts via the following command.

systemctl reboot

The following steps configure and start docker from the bastion host after all hosts have been rebooted.

1. Download this ansible playbook 
   https://raw.githubusercontent.com/piggyvenus/examples/master/installAnsibleSample/v3.7/atomic/openshiftprep2.yml
2. Run ansible playbook using the same hosts.prepocp file as shown below. 
   ansible-playbook -i hosts.prepocp openshiftprep2.yml

OpenShift Containerized Installation

Once the ansible hosts file (/etc/ansible/hosts) is updated, the installation can be started by executing the following command.

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

Note: if the inventory file is /etc/ansible/hosts, there is no need to specify it with the "-i" option.

If there is no error after this step, you can access the OpenShift console via https://<your-public-master-hostname>:8443/ and log in with any username and password.

Setting up Persistence for Registry for Non-Production

There are many options to set up persistence for the registry on AWS. Since I only have one registry for the single-master cluster, I decided to use gp2, which is configured as the default storageclass after the installation. Here are the steps I used to set up storage for the OpenShift internal registry. I used ReadWriteOnce as the access mode because the AWSElasticBlockStore volume plugin only supports ReadWriteOnce (https://kubernetes.io/docs/concepts/storage/persistent-volumes/).

1. ssh to the Atomic master host
2. /usr/local/bin/oc login -u system:admin
3. oc project default
4. run the following:
oc create -f - <<EOF
{
  "apiVersion": "v1",
  "kind": "PersistentVolumeClaim",
  "metadata": {
    "name": "registry-volume-claim",
    "labels": {
      "deploymentconfig": "docker-registry"
    }
  },
  "spec": {
    "accessModes": [ "ReadWriteOnce" ],
    "resources": {
      "requests": {
        "storage": "20Gi"
      }
    }
  }
}
EOF

5. oc volume deploymentconfigs/docker-registry --add --name=registry-storage -t pvc  --claim-name=registry-volume-claim --overwrite
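To verify that the registry redeployed with the new volume (a sketch, run from the master as system:admin):

oc rollout status dc/docker-registry -n default
oc get pvc registry-volume-claim -n default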

Setting up Metrics with Dynamic storage

Since dynamic provisioning is configured, I used the default gp2 storageclass to configure metrics as well. Here are the steps.

1. Add the following to the /etc/ansible/hosts file on the bastion host
openshift_metrics_install_metrics=true
openshift_metrics_hawkular_hostname=hawkular-metrics.<your wildcard suffix>
openshift_metrics_cassandra_storage_type=dynamic
openshift_metrics_image_version=v3.7.23

2. Run the metrics playbook from the bastion host to set up metrics
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml 

If the playbook fails, simply uninstall the metrics component by setting openshift_metrics_install_metrics=false and re-running the metrics playbook.

Setting up Logging with Dynamic Storage

Logging can be configured via an ansible playbook as well. I am using the default gp2 storageclass since it provides dynamic provisioning for the persistent volumes. Here are the steps.

1. Add the following to the /etc/ansible/hosts file on the bastion host
openshift_logging_install_logging=true
openshift_logging_image_version=v3.7.23
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=30Gi
openshift_logging_es_cluster_size=1
openshift_logging_es_memory_limit=1Gi

2. Run the logging playbook from the bastion host to set up logging
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml

If the playbook fails, simply uninstall the logging component by setting openshift_logging_install_logging=false and re-running the logging playbook.

Installing from a Container on an Atomic Host Option

Instead of using a bastion host to execute the ansible playbook on an Atomic host, you can execute the following command to install OpenShift from a container. Here are the steps I tested.

1.  atomic install --system \
> --storage=ostree \
> --set INVENTORY_FILE=/root/hosts \
registry.access.redhat.com/openshift3/ose-ansible:v3.7
Getting image source signatures
Copying blob sha256:9cadd93b16ff2a0c51ac967ea2abfadfac50cfa3af8b5bf983d89b8f8647f3e4
 71.41 MB / ? [----------------------------------=-------------------------] 7s 
Copying blob sha256:4aa565ad8b7a87248163ce7dba1dd3894821aac97e846b932ff6b8ef9a8a508a
 1.21 KB / ? [=------------------------------------------------------------] 0s 
Copying blob sha256:7952714329657fa2bb63bbd6dddf27fcf717186a9613b7fab22aeb7f7831b08a
 146.93 MB / ? [---------------------------------------------=------------] 16s 
Copying config sha256:45abc081093b825a638ec53a19991af0612e96e099554bbdfa88b341cdfcd2e6
 4.23 KB / 4.23 KB [========================================================] 0s
Writing manifest to image destination
Storing signatures
Extracting to /var/lib/containers/atomic/ose-ansible-v3.7.0
systemctl daemon-reload
systemd-tmpfiles --create /etc/tmpfiles.d/ose-ansible-v3.7.conf
systemctl enable ose-ansible-v3.7

2. systemctl start ose-ansible-v3.7
3. journalctl -xfu ose-ansible-v3.7

Hope this will help someone to have a successful OpenShift containerized installation.

Installing OCP 3.7 on Azure

My goal in this blog is to share my experience installing OCP 3.7 on Azure. With the information provided here, I hope I can save you some time and help you have a successful installation on Azure. The exercise was to install OCP on Azure and enable dynamic provisioning as well. Here is a list of things you may want to pay more attention to when you plan your installation.

Use Azure instance name as the value of openshift_hostname in your inventory file

Assuming you are using the OpenShift advanced installation on Azure, you will need to configure openshift_hostname in the inventory file to match the instance name in your Azure portal. The Azure instance name is the name of your virtual machine.

Host Configuration

When creating virtual machines in your resource group, it depends on whether you are using the Azure domain or a custom domain of your own. If you are not using the Azure domain, you can set your domain on your VM via the following command:

hostnamectl set-hostname <FQDN of the host>

Setup NetworkManager

Once the ansible inventory file is created, execute the following steps to make sure NetworkManager is working properly.

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-node/network_manager.yml

Configure resolv.conf on all hosts

export DOMAIN=`domainname -d`

ansible all -m shell -a "systemctl restart NetworkManager"

ansible all -m shell -a "nmcli con modify eth0 ipv4.dns-search $DOMAIN"

ansible all -m shell -a "systemctl restart NetworkManager"
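To double-check that the search domain was applied on every host, something like this (a sketch run from the same ansible control host) can be used:

ansible all -m shell -a "cat /etc/resolv.conf"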

Create service principal

This is a required step to set up Azure dynamically provisioned storage. There are a few options for creating the service principal; an example is shown below. Please see https://github.com/microsoft/openshift-container-platform for more details.

az ad sp create-for-rbac -n <friendly name> --password <password> --role contributor --scopes /subscriptions/<subscription_id>

Location in azure.conf is not optional for managed and unmanaged disk

If you are using the managed disk option, you will want to make sure location is also in your azure.conf.

Here is the example of /etc/azure/azure.conf.

tenantId: xxxxxxxxxx
subscriptionId: xxxxxxxxxxxxxx
aadClientId: xxxxxxxxxx
aadClientSecret: xxxxx
aadTenantId: xxxxxxxxxx
resourceGroup: xxxx
cloud: AzurePublicCloud
location: westus

Did not need to remove the node

The registration is automated. Once the installation is completed, add /etc/azure/azure.conf on all nodes.

Update the master-config.yaml with the following information:

kubernetesMasterConfig:
  apiServerArguments:
    cloud-config:
    - /etc/azure/azure.conf
    cloud-provider:
    - azure
    runtime-config:
    - apis/settings.k8s.io/v1alpha1=true
    storage-backend:
    - etcd3
    storage-media-type:
    - application/vnd.kubernetes.protobuf
  controllerArguments:
    cloud-provider:
    - azure
    cloud-config:
    - /etc/azure/azure.conf

Restart atomic-openshift-master-api and atomic-openshift-master-controllers on the master via the following commands:

systemctl restart atomic-openshift-master-api

systemctl restart atomic-openshift-master-controllers

Update all node-config.yaml as below

kubeletArguments:
  cloud-config:
  - /etc/azure/azure.conf
  cloud-provider:
  - azure

Restart atomic-openshift-node on the nodes via the following command:

systemctl restart atomic-openshift-node

While the atomic-openshift-node service is starting up, you can watch the log by running the following on the node:

journalctl -u atomic-openshift-node -f

You should see the node get removed and re-registered with the cloud provider. There is no need to delete the node. If you run the command below while restarting the atomic-openshift-node service, you will see the specific node get removed and added back to the cluster after some time.

oc get nodes -w

Creating the correct storage type for your environment

You can validate your disk option before creating the storageclass. If the type in the storageclass does not match, the pod will throw an error when it starts to mount the volume.

At VM creation, the managed and unmanaged disk options are available. I learned that we need the location string in /etc/azure/azure.conf for the managed disk option. It also looks like RWX is not an option for the Azure disk volume plug-in. Here are the storageclasses for unmanaged disk (top) and managed disk (bottom) that I tried.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-storageclass
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/azure-disk
parameters:
  storageAccount: xxxxxxxx

 

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: slow2
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS  
  kind: Managed
  location: westus

To set the storageclass as default:

oc annotate storageclass azure-storageclass storageclass.beta.kubernetes.io/is-default-class="true"
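Once the default storageclass is in place, a quick way to verify dynamic provisioning is to create a small test PVC against it. The claim name and size below are just examples:

oc create -f - <<EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: azure-test-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
oc get pvc azure-test-claim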

Now, you should be able to install an OpenShift cluster on Azure with Azure storage as default storageclass.