How to recover expired certificates on OpenShift v4.6

This is a simple test of the automatic certificate recovery procedure in OpenShift v4.6. Further information is available in the documentation: Recovering from expired control plane certificates.

I stopped a freshly installed OpenShift cluster before its first certificate rotation, then started it again 48 hours later so that the certificates would have expired. By default, the certificates issued during installation expire within 24 hours.
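If you want to confirm the expiry on a node before starting, a minimal sketch along these lines may help. The helper name `cert_expires_within` is hypothetical, and the certificate path in the usage note is an assumption about where the kubelet keeps its client certificate; it is not taken from this test.

```shell
# Hypothetical helper: succeeds (exit 0) when the certificate in file $1
# is already expired or will be expired $2 seconds from now.
# Returns 2 when the file cannot be read.
cert_expires_within() {
  [ -r "$1" ] || return 2
  # `openssl x509 -checkend N` exits 0 if the certificate is still valid
  # N seconds from now and non-zero otherwise, so the result is inverted.
  ! openssl x509 -noout -checkend "$2" -in "$1" >/dev/null
}
```

On a node, something like `cert_expires_within /var/lib/kubelet/pki/kubelet-client-current.pem 0 && echo expired` would report a certificate that has already expired, assuming that path exists on your nodes.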

Start the stopped OpenShift cluster with expired certificates

As you can see, all nodes are in “NotReady” status because the kubelet client certificates have expired.

$ oc get node
ip-10-9-188-185.ap-northeast-1.compute.internal NotReady master 46h v1.19.0+d59ce34
ip-10-9-189-165.ap-northeast-1.compute.internal NotReady worker 45h v1.19.0+d59ce34
ip-10-9-189-81.ap-northeast-1.compute.internal NotReady worker 45h v1.19.0+d59ce34
ip-10-9-191-230.ap-northeast-1.compute.internal NotReady master 46h v1.19.0+d59ce34
ip-10-9-194-162.ap-northeast-1.compute.internal NotReady worker 45h v1.19.0+d59ce34
ip-10-9-215-20.ap-northeast-1.compute.internal NotReady master 46h v1.19.0+d59ce34

List all CSRs. If there is a pending node-bootstrapper CSR for each node, we are ready to recover the certificates.

$ oc get csr
csr-2587g 3m34s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-49877 3m34s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-5z7x5 3m36s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-6kjvw 3m27s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-78wv4 3m18s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-k8k6g 3m34s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending

Describing one of them also shows that the “node-bootstrapper” CSR is “Pending”:

$ oc describe csr csr-2587g
Name: csr-2587g
Labels: <none>
Annotations: <none>
CreationTimestamp: Fri, 30 Oct 2020 11:09:31 +0900
Requesting User: system:serviceaccount:openshift-machine-config-operator:node-bootstrapper
Status: Pending
Common Name: system:node:ip-10-9-189-81.ap-northeast-1.compute.internal
Serial Number:
Organization: system:nodes
Events: <none>
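To pick out only the pending requests from a longer list, a small filter like the following could be used. It is a sketch: the function name `pending_csr_names` is made up for illustration, and it assumes the condition is the last column of each row, as in the listing above.

```shell
# Hypothetical helper: read `oc get csr --no-headers` style rows on stdin
# and print the name (first column) of every row whose last column,
# the condition, is exactly "Pending".
pending_csr_names() {
  awk '$NF == "Pending" { print $1 }'
}
```

Typical usage would be `oc get csr --no-headers | pending_csr_names`.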

Approve all pending CSRs to recover

You may need to run the “oc adm certificate approve” command multiple times to approve all the CSRs, because new CSRs keep being generated after earlier ones time out.

$ oc get csr -o name | xargs oc adm certificate approve
approved approved approved approved approved approved approved approved
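Since the approval may have to be repeated, it can be wrapped in a loop. The following is only a sketch under stated assumptions: the function name `approve_pending_csrs`, the default of 10 rounds, and the 30-second pause are illustrative choices, not values from this test. It also approves only CSRs that are still pending, rather than every CSR as the one-liner above does.

```shell
# Hypothetical helper: approve pending CSRs repeatedly until none remain.
#   $1 = maximum number of approval rounds (default 10, illustrative)
#   $2 = seconds to wait after each round  (default 30, illustrative)
approve_pending_csrs() {
  max_rounds="${1:-10}"
  sleep_seconds="${2:-30}"
  round=1
  while [ "$round" -le "$max_rounds" ]; do
    # Names of CSRs whose last column (the condition) is "Pending".
    pending=$(oc get csr --no-headers | awk '$NF == "Pending" { print $1 }')
    if [ -z "$pending" ]; then
      echo "no pending CSRs left after $((round - 1)) approval round(s)"
      return 0
    fi
    # Word splitting is intentional here: one CSR name per argument.
    oc adm certificate approve $pending
    # Give the nodes time to submit follow-up CSRs before checking again.
    sleep "$sleep_seconds"
    round=$((round + 1))
  done
  echo "pending CSRs still present after $max_rounds rounds" >&2
  return 1
}
```

Calling `approve_pending_csrs` with no arguments uses the defaults. If the pause is too short, the loop may finish before the nodes have submitted their next requests, so check `oc get csr` again afterwards.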

After all pending “node-bootstrapper” CSRs are approved, the node certificates are rotated automatically, including approval of the related node certificate CSRs.

$ oc get csr
csr-2587g 11m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-49877 11m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-5z7x5 11m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-6kjvw 10m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-78wv4 10m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-f9ln7 6m32s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-fqs44 3m8s system:node:ip-10-9-189-81.ap-northeast-1.compute.internal Approved,Issued
csr-gh4b6 6m34s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-k8k6g 11m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-kvskg 3m7s system:node:ip-10-9-189-165.ap-northeast-1.compute.internal Approved,Issued
csr-lhnpn 3m11s system:node:ip-10-9-194-162.ap-northeast-1.compute.internal Approved,Issued
csr-lr46c 3m13s system:node:ip-10-9-188-185.ap-northeast-1.compute.internal Approved,Issued
csr-mtgjr 3m7s system:node:ip-10-9-191-230.ap-northeast-1.compute.internal Approved,Issued
csr-zhkp4 3m7s system:node:ip-10-9-215-20.ap-northeast-1.compute.internal Approved,Issued

Check the node status after recovery procedures

$ oc get node
ip-10-9-188-185.ap-northeast-1.compute.internal Ready master 46h v1.19.0+d59ce34
ip-10-9-189-165.ap-northeast-1.compute.internal Ready worker 45h v1.19.0+d59ce34
ip-10-9-189-81.ap-northeast-1.compute.internal Ready worker 45h v1.19.0+d59ce34
ip-10-9-191-230.ap-northeast-1.compute.internal Ready master 46h v1.19.0+d59ce34
ip-10-9-194-162.ap-northeast-1.compute.internal Ready worker 45h v1.19.0+d59ce34
ip-10-9-215-20.ap-northeast-1.compute.internal Ready master 46h v1.19.0+d59ce34

You can see that all nodes are back in “Ready” status after the recovery procedure. It was much simpler than I expected.
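The final check can also be scripted. This is a rough sketch: the function name `nodes_not_ready` is hypothetical, and it assumes the status is the second column of each row, as in the output above.

```shell
# Hypothetical helper: read `oc get node --no-headers` style rows on stdin,
# print every node whose status (second column) does not start with "Ready",
# and exit non-zero if at least one such node was found.
# A status such as "Ready,SchedulingDisabled" is treated as Ready.
nodes_not_ready() {
  awk '$2 !~ /^Ready(,|$)/ { print $1; bad = 1 } END { exit bad }'
}
```

For example, `oc get node --no-headers | nodes_not_ready && echo "all nodes Ready"` prints the confirmation only when no node is listed.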

Thank you for reading.



Daein Park

Hi, there. I’m Daein. Just do something fun :) Nothing happens, if you do nothing. #OpenShift #Kubernetes #Containers #Troubleshooting #Linux #OpenSource