How to recover expired certificates on OpenShift v4.6

This is simple test evidence of the automatic certificate recovery procedure in OpenShift v4.6. For further information, see “Recovering from expired control plane certificates” in the OpenShift documentation.

I stopped a freshly installed OpenShift cluster before its first certificate rotation, then started it again 48 hours later so that the certificates would have expired. The certificates issued during installation expire within 24 hours by default.

Start the stopped OpenShift cluster with expired certificates

As you can see, all nodes are in “NotReady” status because the kubelet client certificates have expired.

$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-9-188-185.ap-northeast-1.compute.internal NotReady master 46h v1.19.0+d59ce34
ip-10-9-189-165.ap-northeast-1.compute.internal NotReady worker 45h v1.19.0+d59ce34
ip-10-9-189-81.ap-northeast-1.compute.internal NotReady worker 45h v1.19.0+d59ce34
ip-10-9-191-230.ap-northeast-1.compute.internal NotReady master 46h v1.19.0+d59ce34
ip-10-9-194-162.ap-northeast-1.compute.internal NotReady worker 45h v1.19.0+d59ce34
ip-10-9-215-20.ap-northeast-1.compute.internal NotReady master 46h v1.19.0+d59ce34
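
To pull just the affected node names out of that listing, a small filter along these lines can help. This is only a sketch: `not_ready_nodes` is a hypothetical helper name, and it assumes the default column layout of `oc get node` shown above, with STATUS as the second column.

```shell
# Hypothetical helper: reads `oc get node --no-headers` output on stdin
# and prints the name of every node whose STATUS does not start with Ready.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage (assumes a logged-in `oc` client):
#   oc get node --no-headers | not_ready_nodes
```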

List all CSRs. If there is a pending node-bootstrapper CSR for each node, we are ready to recover the certificates.

$ oc get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-2587g 3m34s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-49877 3m34s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-5z7x5 3m36s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-6kjvw 3m27s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-78wv4 3m18s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-k8k6g 3m34s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
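
When the CSR list is long, it may be easier to filter it than to read it. The following is a rough sketch, assuming the column layout shown above where CONDITION is the last column; `pending_bootstrap_csrs` is a hypothetical helper name, not part of `oc`.

```shell
# Hypothetical helper: reads `oc get csr --no-headers` output on stdin and
# prints the names of CSRs that are Pending and were requested by the
# node-bootstrapper service account.
pending_bootstrap_csrs() {
  awk '$NF == "Pending" && /node-bootstrapper/ { print $1 }'
}

# Usage (assumes a logged-in `oc` client):
#   oc get csr --no-headers | pending_bootstrap_csrs
#   oc get csr --no-headers | pending_bootstrap_csrs | wc -l
```

Comparing the count from `wc -l` with the number of nodes gives a quick, though not exact, indication of whether every node has submitted a request.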

The detailed view also shows that the “node-bootstrapper” CSR is “Pending”, as follows.

$ oc describe csr csr-2587g
Name: csr-2587g
Labels: <none>
Annotations: <none>
CreationTimestamp: Fri, 30 Oct 2020 11:09:31 +0900
Requesting User: system:serviceaccount:openshift-machine-config-operator:node-bootstrapper
Signer: kubernetes.io/kube-apiserver-client-kubelet
Status: Pending
Subject:
Common Name: system:node:ip-10-9-189-81.ap-northeast-1.compute.internal
Serial Number:
Organization: system:nodes
Events: <none>
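
The Common Name in the describe output tells us which node a CSR belongs to. As an illustration only, a hypothetical helper named `csr_node_name` could extract it; this sketch assumes the `system:node:` prefix seen in the output above.

```shell
# Hypothetical helper: reads `oc describe csr <name>` output on stdin and
# prints the node name taken from the "Common Name: system:node:<node>" line.
csr_node_name() {
  sed -n 's/^[[:space:]]*Common Name:[[:space:]]*system:node://p'
}

# Usage (assumes a logged-in `oc` client):
#   oc describe csr csr-2587g | csr_node_name
```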

Approve all pending CSRs to recover

You may need to run the “oc adm certificate approve” command multiple times in order to approve all the CSRs, because new CSRs can keep being generated after earlier ones time out.

$ oc get csr -o name | xargs oc adm certificate approve 
certificatesigningrequest.certificates.k8s.io/csr-2587g approved
certificatesigningrequest.certificates.k8s.io/csr-49877 approved
certificatesigningrequest.certificates.k8s.io/csr-5z7x5 approved
certificatesigningrequest.certificates.k8s.io/csr-6kjvw approved
certificatesigningrequest.certificates.k8s.io/csr-78wv4 approved
certificatesigningrequest.certificates.k8s.io/csr-f9ln7 approved
certificatesigningrequest.certificates.k8s.io/csr-gh4b6 approved
certificatesigningrequest.certificates.k8s.io/csr-k8k6g approved
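
Since the approval may have to be repeated, the rounds can be scripted. The following is a minimal sketch rather than an official procedure: `pending_csrs` and `approve_until_clean` are hypothetical helper names, the round count and interval are arbitrary illustrative defaults, and it assumes CONDITION is the last column of `oc get csr`. Unlike the command above, it approves only CSRs that are still Pending.

```shell
# Hypothetical helper: reads `oc get csr --no-headers` output on stdin
# and prints the names of CSRs whose CONDITION (last column) is Pending.
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}

# Approve pending CSRs in rounds. Returns 0 as soon as a round finds no
# pending CSR, or 1 if CSRs were still pending in each of max_rounds rounds.
#   $1: max_rounds (illustrative default: 10)
#   $2: seconds to sleep between rounds (illustrative default: 30)
approve_until_clean() {
  max_rounds=${1:-10}
  interval=${2:-30}
  round=1
  while [ "$round" -le "$max_rounds" ]; do
    names=$(oc get csr --no-headers | pending_csrs)
    if [ -z "$names" ]; then
      return 0
    fi
    for name in $names; do
      oc adm certificate approve "$name"
    done
    sleep "$interval"
    round=$((round + 1))
  done
  return 1
}

# Usage (assumes a logged-in `oc` client with permission to approve CSRs):
#   approve_until_clean 10 30
```

Note that a round with no pending CSRs does not guarantee that none will appear later, so it is still worth rechecking `oc get csr` afterwards. Blindly approving every CSR is only reasonable on a test cluster like this one.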

After all pending “node-bootstrapper” CSRs are approved, the node certificates are rotated automatically, including approval of the related node certificate CSRs.

$ oc get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-2587g 11m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-49877 11m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-5z7x5 11m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-6kjvw 10m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-78wv4 10m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-f9ln7 6m32s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-fqs44 3m8s kubernetes.io/kubelet-serving system:node:ip-10-9-189-81.ap-northeast-1.compute.internal Approved,Issued
csr-gh4b6 6m34s kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-k8k6g 11m kubernetes.io/kube-apiserver-client-kubelet system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-kvskg 3m7s kubernetes.io/kubelet-serving system:node:ip-10-9-189-165.ap-northeast-1.compute.internal Approved,Issued
csr-lhnpn 3m11s kubernetes.io/kubelet-serving system:node:ip-10-9-194-162.ap-northeast-1.compute.internal Approved,Issued
csr-lr46c 3m13s kubernetes.io/kubelet-serving system:node:ip-10-9-188-185.ap-northeast-1.compute.internal Approved,Issued
csr-mtgjr 3m7s kubernetes.io/kubelet-serving system:node:ip-10-9-191-230.ap-northeast-1.compute.internal Approved,Issued
csr-zhkp4 3m7s kubernetes.io/kubelet-serving system:node:ip-10-9-215-20.ap-northeast-1.compute.internal Approved,Issued

Check the node status after the recovery procedure

$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-9-188-185.ap-northeast-1.compute.internal Ready master 46h v1.19.0+d59ce34
ip-10-9-189-165.ap-northeast-1.compute.internal Ready worker 45h v1.19.0+d59ce34
ip-10-9-189-81.ap-northeast-1.compute.internal Ready worker 45h v1.19.0+d59ce34
ip-10-9-191-230.ap-northeast-1.compute.internal Ready master 46h v1.19.0+d59ce34
ip-10-9-194-162.ap-northeast-1.compute.internal Ready worker 45h v1.19.0+d59ce34
ip-10-9-215-20.ap-northeast-1.compute.internal Ready master 46h v1.19.0+d59ce34
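
Nodes may take a little while to report Ready after the certificates are issued, so the check can be polled instead of rerun by hand. This is a sketch under the same assumptions as before (default `oc get node` columns, STATUS in the second column); `all_nodes_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `oc get node --no-headers` output on stdin and
# succeeds only if at least one node is listed and every STATUS starts with Ready.
all_nodes_ready() {
  awk '$2 !~ /^Ready/ { bad = 1 } END { if (NR == 0 || bad) exit 1 }'
}

# Usage (assumes a logged-in `oc` client):
#   until oc get node --no-headers | all_nodes_ready; do sleep 10; done
```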

You can see that all nodes have returned to “Ready” status after the recovery procedure. It was much simpler than I expected.

Thank you for reading.

Daein Park

Hi, there. I’m Daein. Just do something fun :) Nothing happens, if you do nothing. #OpenShift #Kubernetes #Containers #Troubleshooting #Linux #OpenSource