Extending the Lightbits Cluster

The lb-csi-plugin is stateless and holds no persistent information between operations.

Kubernetes API calls that invoke the lb-csi-plugin (that is, calls that use Lightbits storage) require the Lightbits management endpoints so that the plugin can access the Lightbits cluster.

Today, the management endpoint list is supplied to the plugin through the StorageClass.Parameters.mgmt-endpoint field.

The CSI API does not pass request.Parameters to the plugin in some types of API calls. As a result, there are situations where the plugin needs to access the Lightbits API but does not have the information required to do so.

To overcome this limitation, we defined an internal resource named ResourceID, which embeds this information in the resource ID passed to Kubernetes.

A ResourceID is used as the value of the PersistentVolume.spec.volumeHandle and VolumeSnapshotContent.status.snapshotHandle fields.

The ResourceID is guaranteed to be passed on every API call from CSI, which means that we can use it to hold limited state.

ResourceID format is: mgmt:<host>:<port>[,<host>:<port>...]|nguid:<nguid>|proj:<proj>|scheme:<grpc|grpcs>
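
As an illustration of this format, the sketch below swaps the endpoint list inside a ResourceID while leaving the nguid, proj, and scheme fields untouched. The function name and the sample values are hypothetical and are not part of lb-csi-plugin or the patcher script; the only assumption is that the mgmt field comes first, as in the format above.

```shell
# Hypothetical helper (not part of lb-csi-plugin): replace the endpoint
# list in a ResourceID and keep every other field as it was.
replace_mgmt_endpoints() {
    resource_id=$1
    new_endpoints=$2
    case $resource_id in
        mgmt:*"|"*) ;;   # expected shape: 'mgmt:<list>|<other fields>'
        *) echo "not a ResourceID: $resource_id" >&2; return 1 ;;
    esac
    # Everything after the first '|' is carried over verbatim.
    rest=${resource_id#*"|"}
    printf 'mgmt:%s|%s\n' "$new_endpoints" "$rest"
}

# Sample values only; the nguid and project name are made up.
replace_mgmt_endpoints \
    'mgmt:10.0.0.1:443|nguid:6bb32fb5-99aa-4a4c-a4e7-30b7787bbd66|proj:a|scheme:grpcs' \
    '10.0.0.1:443,10.0.0.2:443'
# prints: mgmt:10.0.0.1:443,10.0.0.2:443|nguid:6bb32fb5-99aa-4a4c-a4e7-30b7787bbd66|proj:a|scheme:grpcs
```

Because the function rebuilds the first field from scratch, applying it a second time with the same endpoint list returns the same string.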

When the Lightbits cluster is expanded (i.e., servers are added to or changed in an existing cluster), the management-endpoints list must also be updated so that the lb-csi-plugin can access any of the Lightbits api-servers.

Existing resources cannot simply be edited to carry the new list, because the PersistentVolume.spec.volumeHandle and VolumeSnapshotContent.status.snapshotHandle fields are immutable and cannot be changed after the resource is created.

To mitigate this problem, we provide a one-shot script that accesses a Kubernetes cluster via standard kubectl calls and patches the following resources with the updated information:

  • StorageClass - will modify the parameters.mgmt-endpoint field with the new endpoint list.
  • PersistentVolume - will modify the resource spec.volumeHandle by using the kubectl replace call.
  • VolumeSnapshotContent - will replace the resource's spec.source.volumeHandle with the new updated spec.source.snapshotHandle - which will contain the new endpoint list.

This behavior of relying on 'ResourceID' will be fixed in future versions of lb-csi-plugin, once the Lightbits cluster supports VIP. All resources will point to a single endpoint, which will not change during Lightbits cluster updates.

The script is intended to be idempotent and safe to run on resources that were already updated.
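
One way to keep such a run idempotent is to compare the endpoint list already embedded in a handle with the desired list, and to skip the resource when they match. The sketch below shows that check with hypothetical helper names; it assumes the ResourceID format described above and does not reflect how the provided script is actually implemented.

```shell
# Hypothetical helpers, shown for illustration only.

# Print the endpoint list embedded in a ResourceID.
current_mgmt_endpoints() {
    id=$1
    first=${id%%"|"*}                 # keep only 'mgmt:<list>'
    printf '%s\n' "${first#mgmt:}"    # drop the 'mgmt:' prefix
}

# Succeed (exit status 0) only when the handle carries a different list.
needs_update() {
    [ "$(current_mgmt_endpoints "$1")" != "$2" ]
}

handle='mgmt:10.0.0.1:443,10.0.0.2:443|nguid:n1|proj:a|scheme:grpcs'
if needs_update "$handle" '10.0.0.1:443,10.0.0.2:443'; then
    echo "handle must be rewritten"
else
    echo "handle already up to date, skipping"
fi
# prints: handle already up to date, skipping
```

The comparison is a plain string match, so the same endpoints listed in a different order would count as a change.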

Usage

Below is the patcher help output we provide to patch the existing resources in the cluster:

The order of the commands should be:

  1. Apply the script against all StorageClasses with the -v option. Verify that all StorageClasses and PVs are updated.
  2. If there are VolumeSnapshots on the cluster, apply the script with the -s option.
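
The verification mentioned in step 1 could be approached as in the sketch below, which reads name and handle pairs on standard input and prints the names whose handle does not yet start with the expected endpoint list. The function is hypothetical and not part of the provided script. The kubectl invocation in the comment assumes that the handle is exposed at .spec.csi.volumeHandle, the standard location for CSI-backed PVs.

```shell
# Hypothetical check, for illustration only.
# Input: lines of the form "<name> <handle>".
# Output: names whose handle does not begin with "mgmt:<expected list>|".
list_stale_handles() {
    expected=$1
    while read -r name handle; do
        [ -n "$name" ] || continue       # ignore blank lines
        case $handle in
            "mgmt:$expected|"*) ;;       # already carries the new list
            *) printf '%s\n' "$name" ;;
        esac
    done
}

# Possible use against a live cluster (not executed here):
#   kubectl get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,HANDLE:.spec.csi.volumeHandle \
#     | list_stale_handles '<new-comma-separated-endpoint-list>'
```

An empty result means every listed PV already carries the new list. PVs that do not belong to lb-csi-plugin would also be reported, so the input should be filtered to the relevant StorageClass first.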

Avoid operations that might access PV, PVC, or VolumeSnapshot resources while this script is running. Operations such as replace delete and recreate the resource with different values. As a result, the StorageClass may temporarily be inaccessible.

On a cluster that has existing PVs before expanding the Lightbits cluster, run the following:

This command will:

  1. Patch the StorageClass.Parameters.mgmt-endpoint with the new-comma-separated-endpoint-list value.
  2. Look up all PVs in the StorageClass, and patch the PersistentVolume.spec.volumeHandle value with the new-comma-separated-endpoint-list.

On a cluster that has VolumeSnapshots created before expanding the Lightbits cluster, run the following:

This command will:

  1. Look up all VolumeSnapshotContents in this VolumeSnapshotClass and replace the VolumeSnapshotContent.spec.source.volumeHandle value with the VolumeSnapshotContent.spec.source.snapshotHandle.

The VolumeSnapshotContent.Status.restoreSize field will be zeroed out, because it is a calculated property derived from ListSnapshots, which is not implemented, and it cannot be modified via the API.

The value remains valid under VolumeSnapshot.Status.restoreSize, so an attempt to restore this snapshot into a PVC of a smaller size still fails, as expected, with the following error:

requested volume size 1073741824 is less than the size 2147483648 for the source snapshot s1
