Migrating the Persistent Storage of OpenShift Image Registry from VMDK to NFS
Recently, I needed to help a customer migrate the persistent storage of the Image Registry in an OpenShift Container Platform cluster from a VMDK block disk to NFS storage.
Here are the steps I used to complete the migration.
Step 1) Stop the Image Registry service
Check that Image Registry is currently running with 1 replica.
$ oc get pods -n openshift-image-registry -ldocker-registry=default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
image-registry-6886f79486-sr8lf 1/1 Running 0 14d 10.131.0.58 infra1.example.com <none> <none>
Stop the Image Registry by scaling it to 0 replicas.
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"replicas":0}}' --type=merge
Verify that no Image Registry Pod is running any more.
$ oc get pods -n openshift-image-registry -ldocker-registry=default -o wide
No resources found in openshift-image-registry namespace.
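Pods can linger in Terminating state for a few seconds after the patch, so it may help to poll rather than check once. The following is a minimal sketch, not part of the original procedure; wait_for_registry_pods is a hypothetical helper name, and the 5-second interval and 120-second default timeout are arbitrary assumptions. It only counts Pods and does not look at their readiness.

```shell
# Hypothetical helper: poll until the number of registry Pods equals EXPECTED.
# Usage: wait_for_registry_pods EXPECTED [TIMEOUT_SECONDS]
wait_for_registry_pods() {
    expected=$1
    timeout=${2:-120}   # assumed default, adjust as needed
    elapsed=0
    while :; do
        # "-o name" prints one "pod/<name>" line per Pod and nothing if there are none
        actual=$(oc get pods -n openshift-image-registry \
            -l docker-registry=default -o name 2>/dev/null | grep -c '^pod/' || true)
        if [ "$actual" -eq "$expected" ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out: $actual Pod(s) present, expected $expected" >&2
            return 1
        fi
        sleep 5
        elapsed=$((elapsed + 5))
    done
}
```

For this step you would call it as: wait_for_registry_pods 0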
Step 2) Identify the VMDK disk for the Image Registry
Use the command below to get the name of the VMDK disk, including its path.
oc get pv $(oc get pvc -n openshift-image-registry image-registry-pvc -o jsonpath='{.spec.volumeName}') -o jsonpath='{.spec.vsphereVolume.volumePath}' && echo
Below is an example of the output from the command:
[vsanDatastore] 090ab15e-64de-e0ad-a100-3868dd22ecb8/ocp1-pprmj-dynamic-pvc-5f12c6d3-466a-42db-85f5-f774f47005a5.vmdk
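The output has the form "[datastore] folder/file.vmdk". When you later browse for the disk in vCenter, it can be handy to have the datastore and the file path separately. Here is a small sketch that splits the string with plain shell parameter expansion; vmdk_datastore and vmdk_file are hypothetical helper names.

```shell
# Hypothetical helpers: split "[datastore] folder/file.vmdk" into its two parts.
vmdk_datastore() {
    rest=${1#\[}                    # drop the leading "["
    printf '%s\n' "${rest%%\]*}"    # keep everything before the first "]"
}

vmdk_file() {
    printf '%s\n' "${1#*\] }"       # drop the "[datastore] " prefix
}
```

For the example output above, vmdk_datastore prints vsanDatastore and vmdk_file prints the folder and file name of the disk.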
Step 3) Attach the VMDK disk to the NFS server (or a VM for file copying)
If your NFS server is a VM running in the same vSphere cluster as the OpenShift cluster and has access to the datastore holding the Image Registry VMDK, you can add the VMDK disk to the NFS server using the vCenter web console.
If your NFS server has no access to the VMDK disk, identify a Linux VM that does and add the VMDK disk to that VM instead.
This step enables you to gain access to the files stored in the Image Registry VMDK.
Step 4) Copy files to NFS server
On your NFS server or file-copying Linux VM, mount the VMDK disk. For example, if the disk is formatted as ext4:
mount -t ext4 /dev/sdc /mnt/registrydisk
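The device name and filesystem type will differ between environments. As a hedged sketch, the helper below detects the filesystem type with blkid and mounts the disk read-only so that the source data cannot be modified during the copy. mount_registry_disk is a hypothetical name, and mounting read-only is my own assumption, not something the original steps require.

```shell
# Hypothetical helper: mount DEVICE read-only on MOUNTPOINT using the detected filesystem type.
# Usage: mount_registry_disk /dev/sdc /mnt/registrydisk
mount_registry_disk() {
    dev=$1
    mnt=$2
    # blkid prints only the TYPE value, for example "ext4" or "xfs"
    fstype=$(blkid -o value -s TYPE "$dev") || fstype=""
    if [ -z "$fstype" ]; then
        echo "could not detect a filesystem on $dev" >&2
        return 1
    fi
    mkdir -p "$mnt"
    mount -o ro -t "$fstype" "$dev" "$mnt"
}
```

Running lsblk first is a quick way to confirm which device the newly attached disk appeared as.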
I suggest running the copy inside a tmux session so that it keeps going if your SSH connection drops. Start a session with:
tmux
Inside the tmux session, use the rsync command to copy the files to the NFS filesystem. For example, if the NFS export is /export/registry:
rsync -avzh /mnt/registrydisk/docker /export/registry/
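Because the source path has no trailing slash, rsync places the data in /export/registry/docker. Before unmounting, you may want to confirm that the copy is complete. The sketch below compares the two trees with diff; verify_copy is a hypothetical helper name. Note that diff reads every file on both sides, so this can take a while on a large registry.

```shell
# Hypothetical helper: succeed only if the two directory trees have identical content.
# Usage: verify_copy /mnt/registrydisk/docker /export/registry/docker
verify_copy() {
    src=$1
    dst=$2
    # -r recurses into subdirectories, -q reports only whether files differ
    if diff -rq "$src" "$dst" >/dev/null; then
        echo "OK: $dst matches $src"
    else
        echo "MISMATCH: $dst differs from $src" >&2
        return 1
    fi
}
```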
After the files have been copied, unmount the disk:
umount /mnt/registrydisk
Step 5) Detach the VMDK disk from the NFS server (or a VM for file copying)
You can now detach the disk from your NFS server (or file-copying VM) via the vCenter web console.
Step 6) Configure new PV and PVC for Image Registry
Delete the existing PVC and PV of the Image Registry:
PVNAME=$(oc get pvc -n openshift-image-registry image-registry-pvc -o jsonpath='{.spec.volumeName}')
oc delete pvc -n openshift-image-registry image-registry-pvc
oc delete pv ${PVNAME}
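One thing to be aware of: dynamically provisioned PVs usually have a reclaim policy of Delete, in which case removing the PVC also removes the backing VMDK. If you would like to keep the old disk as a fallback, a sketch like the following could be run before the delete commands above. retain_pv is a hypothetical helper name, and keeping the disk is my assumption about what you want.

```shell
# Hypothetical helper: switch a PV to the Retain reclaim policy if it is not already set.
# Usage: retain_pv "${PVNAME}"
retain_pv() {
    pv=$1
    policy=$(oc get pv "$pv" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}')
    if [ "$policy" != "Retain" ]; then
        # With Retain, deleting the PVC leaves the backing volume in place
        oc patch pv "$pv" --type=merge \
            -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    fi
}
```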
Create a new PV using NFS:
cat <<EOF | oc create -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: image-registry-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 500Gi
  nfs:
    path: /export/registry
    server: <NFS Server IP>
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
EOF
Create a new PVC using the command below:
cat <<EOF | oc create -n openshift-image-registry -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 500Gi
EOF
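If your cluster has a default StorageClass, a PVC without a storageClassName may be handed to the dynamic provisioner instead of binding to the NFS PV. One way to avoid that, sketched here as an alternative to the PVC above, is to set an empty storageClassName and pin the claim to the PV by name. render_registry_pvc is a hypothetical helper that only prints the manifest; the default PV name and size mirror the values used in this post.

```shell
# Hypothetical helper: print a PVC manifest pinned to a specific PV.
# Usage: render_registry_pvc [PV_NAME] [SIZE]
render_registry_pvc() {
    pv_name=${1:-image-registry-pv}
    size=${2:-500Gi}
    # storageClassName "" opts out of the default StorageClass;
    # volumeName binds the claim to the named PV only
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-pvc
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  volumeName: ${pv_name}
  resources:
    requests:
      storage: ${size}
EOF
}
```

To apply it, pipe the output into oc: render_registry_pvc | oc create -n openshift-image-registry -f -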
Verify the status of the PVC created. It should be bound to the PV image-registry-pv.
$ oc get pvc -n openshift-image-registry
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
image-registry-pvc Bound image-registry-pv 500Gi RWX 19s
Step 7) Resume the Image Registry
Start the Image Registry by scaling it to 1 replica.
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"replicas":1}}' --type=merge
Check that Image Registry is currently running with 1 replica.
$ oc get pods -n openshift-image-registry -ldocker-registry=default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
image-registry-6886f79486-44mkn 1/1 Running 0 9s 10.131.0.101 infra1.example.com <none> <none>
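A Pod in Running state is not necessarily ready to serve requests. As a small sketch, the helper below reads the number of ready replicas from the registry Deployment; registry_ready_replicas is a hypothetical name, and I am assuming the Deployment is called image-registry, which matches the Pod names shown above.

```shell
# Hypothetical helper: print how many registry replicas are ready (0 if none).
registry_ready_replicas() {
    ready=$(oc get deployment image-registry -n openshift-image-registry \
        -o jsonpath='{.status.readyReplicas}')
    # The field is absent (empty output) while no replica is ready
    echo "${ready:-0}"
}
```

oc rollout status deployment/image-registry -n openshift-image-registry is another option if you prefer a command that blocks until the rollout finishes.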
Step 8) Verify the functionality of Image Registry
Access an OCP node:
oc debug nodes/master1.example.com
chroot /host
Log in with the oc CLI:
oc login -u admin -p <password> https://api.ocp1.example.com:6443
Identify an available image in the registry for testing:
sh-4.4# oc get is -n openshift-logging
NAME IMAGE REPOSITORY TAGS UPDATED
elastalert default-route-openshift-image-registry.apps.ocp1.example.com/openshift-logging/elastalert 5.5.3-v2 2 months ago
Log in to the Image Registry:
podman login -u admin -p $(oc whoami -t) image-registry.openshift-image-registry.svc:5000
Test image pulling:
sh-4.4# podman pull image-registry.openshift-image-registry.svc:5000/openshift-logging/elastalert:5.5.3-v2
Trying to pull image-registry.openshift-image-registry.svc:5000/openshift-logging/elastalert:5.5.3-v2...
Getting image source signatures
Copying blob 89db40c7d523 done
Copying blob 41e4432eb425 done
Copying blob 6bb94ea9af20 done
Copying blob 0fe4ddf04ab3 done
Copying blob e5f40ba3bedd done
Copying blob ed4b2f091321 done
Copying blob c724118268e4 done
Copying blob 9504aa5c71f6 done
Copying blob 8ba4a241091d done
Copying blob 63ea6732eb2c done
Copying blob b207cf4e5b9c done
Copying blob 8a4d6f318939 done
Copying blob 7b7041617ccb done
Copying blob 633a39a4bfa5 done
Copying blob 10e5e48d9259 done
Copying blob addf95acda7d done
Copying blob 6f947118570f done
Copying blob ffef7ae5a5ff done
Copying blob c7c4c7a9aa58 done
Copying blob 1dbcab28ce46 done
Copying config 2fa1b288b9 done
Writing manifest to image destination
Storing signatures
2fa1b288b952258168e51f27e780509d20bd025cdb8dc6261d2a08f3383ca2a2
The above confirms that the Image Registry is working well. Leave the debug container by entering exit twice.
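Note that the pull above uses the internal service address, while the IMAGE REPOSITORY column shows the external route. The sketch below illustrates how one maps to the other by swapping the host part; internal_pull_ref is a hypothetical helper, and the service address is the one used in the podman commands above.

```shell
# Hypothetical helper: build an internal pull reference from an image repository and a tag.
# Usage: internal_pull_ref REPOSITORY TAG
internal_pull_ref() {
    repo=$1
    tag=$2
    # Everything after the first "/" is "<namespace>/<imagestream>"
    path=${repo#*/}
    printf '%s/%s:%s\n' "image-registry.openshift-image-registry.svc:5000" "$path" "$tag"
}
```

With the elastalert image stream and tag from the listing above, this prints the same reference that was passed to podman pull.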
Step 9) Scale up the Image Registry
To provide high availability and load balancing for Image Registry, scale it up to 3 replicas:
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"replicas":3}}' --type=merge
Verify that three Image Registry Pods have started:
$ oc get pods -n openshift-image-registry -ldocker-registry=default -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
image-registry-6886f79486-44mkn 1/1 Running 0 11m 10.131.0.101 infra1.example.com <none> <none>
image-registry-6886f79486-flpxk 1/1 Running 0 36s 10.128.2.24 infra2.example.com <none> <none>
image-registry-6886f79486-kslqk 1/1 Running 0 36s 10.129.2.32 infra3.example.com <none> <none>
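For high availability, the replicas should land on different nodes, as they do in the output above. Here is a minimal sketch for checking that in a script; count_distinct_nodes is a hypothetical helper that reads one node name per line on standard input.

```shell
# Hypothetical helper: count the distinct node names read from stdin (one per line).
count_distinct_nodes() {
    # sort -u removes duplicates; awk counts the remaining non-empty lines
    sort -u | awk 'NF { n++ } END { print n + 0 }'
}
```

You could feed it the node names of the registry Pods like this: oc get pods -n openshift-image-registry -l docker-registry=default -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' | count_distinct_nodes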