Volume Management
Understanding a Volume Location in the Lightbits Cluster
Creating and managing volumes is the core of working with block devices. This section explains volumes in a Lightbits cluster, including how to identify where a volume is placed and how a node failure affects the volume's protection state.
You can use the lbcli get volume command to see which nodes in the cluster hold the data of a specific volume. Note that a volume's placement can change during its life cycle due to dynamic rebalancing.
Sample Command
$ lbcli -J $LIGHTOS_JWT get volume --uuid=48dbbe54-1548-4444-866e-6438d4877e5f
Note: A Project Admin should add --project-name=<project-name> to this command.
Sample Output
name vol_3
rebuildProgress None
UUID 48dbbe54-1548-4444-866e-6438d4877e5f
ACL
  values: hostnqn1
nsid 4
size "107374182400"
nodeList
  9a625dbf-de1b-4211-b9d3-0bdf70faa5f5
  9f1a3a85-5be5-4f98-a44e-4271fbdbe7cc
  a7f67ad0-1f3d-49cb-9650-b82042378014
protectionState FullyProtected
replicaCount 3
state Created
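If you script around this output, a minimal sketch (not part of the lbcli tooling) can pull out the protection state with awk. The here-doc below stands in for real captured output like the sample above:

```shell
# Sketch: extract a volume's protection state from saved `lbcli get volume`
# output. The here-doc is a stand-in for real command output.
out=$(cat <<'EOF'
name vol_3
UUID 48dbbe54-1548-4444-866e-6438d4877e5f
protectionState FullyProtected
replicaCount 3
state Created
EOF
)
# Print the second field of the protectionState line.
state=$(printf '%s\n' "$out" | awk '$1 == "protectionState" {print $2}')
echo "$state"   # FullyProtected
```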
You can also run the nvme list-subsys command on the application server against a specific block device that corresponds to a volume on the cluster.
Sample Command
nvme list-subsys nvme0n1
Sample Output
nvme-subsys0 -
NQN=nqn.2014-08.org.nvmexpress:NVMf:uuid:b550533c-8bb0-46df-9019-cd4c25c6e6e7
+- nvme0 tcp traddr=10.18.38.4 trsvcid=4420 live
+- nvme1 tcp traddr=10.18.38.5 trsvcid=4420 live
+- nvme2 tcp traddr=10.18.38.7 trsvcid=4420 live inaccessible
+- nvme3 tcp traddr=10.18.38.8 trsvcid=4420 live optimized
+- nvme4 tcp traddr=10.18.38.29 trsvcid=4420 live inaccessible
This command's output includes the IP address of the cluster's primary node for the volume (marked optimized), the secondary nodes (marked inaccessible), and the other nodes that do not hold data for that volume (block device nvme0n1).
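To locate the primary path programmatically, a small parsing sketch (an assumption, not a Lightbits tool) can scan the output for the optimized path and print its traddr. The here-doc stands in for real `nvme list-subsys` output like the sample above:

```shell
# Sketch: find the IP address of the path NVMe multipathing reports as
# optimized (the primary node for this volume).
subsys=$(cat <<'EOF'
+- nvme0 tcp traddr=10.18.38.4 trsvcid=4420 live
+- nvme1 tcp traddr=10.18.38.5 trsvcid=4420 live
+- nvme2 tcp traddr=10.18.38.7 trsvcid=4420 live inaccessible
+- nvme3 tcp traddr=10.18.38.8 trsvcid=4420 live optimized
+- nvme4 tcp traddr=10.18.38.29 trsvcid=4420 live inaccessible
EOF
)
# On the optimized line, strip the "traddr=" prefix and print the address.
primary=$(printf '%s\n' "$subsys" | awk '/optimized/ {
  for (i = 1; i <= NF; i++) if ($i ~ /^traddr=/) { sub("traddr=", "", $i); print $i }
}')
echo "$primary"   # 10.18.38.8
```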
Volume Placement
With Volume Placement, you can specify which failure domain a volume is placed on when it is created. The idea behind volume placement is to statically define, separate, and manage how volumes and their data tolerate failures and remain available through the failure of a server, rack, row, power grid, etc. By default, Lightbits defines server-level failure domains.
Note that this feature is associated with the create volume command.
Current feature limitations of Volume Placement include the following:
- Available only for single replica volumes.
- Only failure domain labels are currently matched (e.g., fd:server00).
- Up to 25 affinities can be specified.
- Value (failure domain name) is limited to up to 100 characters.
- Volume Placement cannot be specified for a clone (a clone is always placed on the same nodes as the parent volume/snapshot).
- Dynamic Rebalancing must be disabled.
- The entire Lightbits cluster must be upgraded to at least release version 2.3.8.
In Lightbits 2.3.8 and above, a new flag is available - 'placement-affinity' - which can be used as follows:
Sample Command
$ lbcli -J $JWT create volume --name=vol1 --acl=acl1 --size="4 GiB" --replica-count=1 --placement-affinity="fd:Server0|fd:rack1|fd:rack0"
In the example above, you can ask the system to place the volume as follows:
- On a node that includes Server0 in its failure domains. Note that the server name is used as the default failure domain configuration in node-manager.yaml. You are responsible for deciding whether you want anything other than the default yaml that Lightbits provides.
- On a node that includes rack1 in its failure domains.
- On a node that includes rack0 in its failure domains.
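The affinity string is an ordered, "|"-separated list of "fd:" labels, so it can be assembled from a list of failure domains in a shell script. The sketch below is an illustration only, using the hypothetical domain names from the example above:

```shell
# Sketch: build a --placement-affinity value from an ordered list of
# failure domains ("fd:" prefix, "|" separator, up to 25 entries).
domains=("fd:Server0" "fd:rack1" "fd:rack0")

# Join the array elements with "|" using a subshell-local IFS.
affinity=$(IFS='|'; echo "${domains[*]}")
echo "$affinity"   # fd:Server0|fd:rack1|fd:rack0
```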
If Lightbits cannot find active nodes with failure domains that match the volume placement request, the create volume command fails and the volume is not placed on other nodes.
Node Failure and Volume Protection State
The Lightbits cluster software continuously monitors the cluster nodes' health and connectivity and responds to changes in the nodes’ status. In AWS, the Auto Healing feature will take preventive measures in case of AWS notification of an instance outage, and remedial measures on abrupt failure. See the Auto Maintenance Overview section for additional information.
If a node fails, volumes that have data stored on that node can be affected. For a volume with a replication factor of 3, a single node failure may cause the volume protection state to become Degraded. If another node fails, the volume’s state may become ReadOnly.
Although not recommended in AWS, RF2 volumes become ReadOnly after a single node failure, and RF1 volumes become Inaccessible if the node holding their data fails.
In case all nodes that hold a volume’s replica fail, the volume becomes Inaccessible.
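The rules above can be summarized as a small decision function. This is an illustrative sketch of the mapping described in the text, not product code; `expected_state` is a hypothetical helper name:

```shell
# Sketch: expected volume protection state given the replica count and
# the number of failed nodes holding that volume's data (per the text above).
expected_state() {
  local replicas=$1 failed=$2
  if [ "$failed" -eq 0 ]; then echo FullyProtected          # no replica lost
  elif [ "$failed" -ge "$replicas" ]; then echo Inaccessible # all replicas lost
  elif [ "$((replicas - failed))" -eq 1 ]; then echo ReadOnly # one replica left
  else echo Degraded                                         # redundancy reduced
  fi
}
expected_state 3 1   # Degraded
expected_state 3 2   # ReadOnly
expected_state 1 1   # Inaccessible
```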
You can view a volume's protection state by issuing the lbcli list volumes command.
Sample Command
$ lbcli -J $LIGHTOS_JWT --project-name=a list volumes
Sample Output
Name  UUID      Protection State  State    Size     Replicas  ACL
vol1  76c3eae8  FullyProtected    Created  200 GiB  3         values:"acl1"
vol2  3f3c3ad2  Degraded          Created  200 GiB  3         values:"acl2"
vol3  8700cba8  ReadOnly          Created  200 GiB  1         values:"acl3"
As you can see in the output, vol2 and vol3 are not in a FullyProtected volume protection state.
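On a large cluster it can help to filter the listing down to volumes that need attention. The sketch below (an assumption, not an lbcli feature) filters saved `lbcli list volumes` output for anything not FullyProtected; the here-doc stands in for real output with the header removed:

```shell
# Sketch: list volumes whose protection state is not FullyProtected.
vols=$(cat <<'EOF'
vol1 76c3eae8 FullyProtected Created 200 GiB 3 values:"acl1"
vol2 3f3c3ad2 Degraded Created 200 GiB 3 values:"acl2"
vol3 8700cba8 ReadOnly Created 200 GiB 1 values:"acl3"
EOF
)
# Column 3 is the protection state; print name and state for the rest.
printf '%s\n' "$vols" | awk '$3 != "FullyProtected" {print $1, $3}'
# vol2 Degraded
# vol3 ReadOnly
```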
Now, you can use the lbcli list nodes command (a Cluster Admin command) to identify which node has failed. In this command's output, the State column shows each node's state (for example, Active or Inactive):
Sample Command
$ lbcli -J $LIGHTOS_JWT list nodes
Sample Output
NAME UUID State NVME-Endpoint
server00-0 192af7c0-d39f-4872-b849-7eb3dc0f7b53 Active 10.23.26.13:4420
server01-0 1f4ef0ce-0634-47c7-9e5f-d4fd910ff376 Active 10.23.26.8:4420
server02-0 6d9b8337-18cd-4b14-bea1-f56aca213d68 Inactive 10.23.26.4:4420
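To single out the failed node from this listing, a parsing sketch like the one below (again, an assumption rather than an lbcli feature) flags any node whose state is not Active; the here-doc stands in for real `lbcli list nodes` output with the header removed:

```shell
# Sketch: identify failed nodes from saved `lbcli list nodes` output.
nodes=$(cat <<'EOF'
server00-0 192af7c0-d39f-4872-b849-7eb3dc0f7b53 Active 10.23.26.13:4420
server01-0 1f4ef0ce-0634-47c7-9e5f-d4fd910ff376 Active 10.23.26.8:4420
server02-0 6d9b8337-18cd-4b14-bea1-f56aca213d68 Inactive 10.23.26.4:4420
EOF
)
# Column 3 is the node state; report name and state for non-Active nodes.
printf '%s\n' "$nodes" | awk '$3 != "Active" {print $1, $3}'
# server02-0 Inactive
```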