Node Failure and Volume Protection State
If a node fails, volumes that have data stored on that node can be affected. For a volume with a replication factor of 3, a single node failure can cause the volume protection state to become Degraded. If another node fails, the volume’s state may become ReadOnly.
Creating RF2 volumes is not recommended in Azure. If you did create them, an RF2 volume becomes ReadOnly after a single node failure. An RF1 volume becomes Inaccessible if the node that holds its data fails.
If all nodes that hold a volume’s replicas fail, the volume becomes Inaccessible.
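The rules above can be summarized in a small lookup function. This is only a simplified sketch of the behavior described in this section, not Lightbits code: the function name `protection_state` and its parameters are hypothetical, and the state that the cluster actually reports (for example, whether a second failure results in ReadOnly) should always be taken from lbcli.

```python
def protection_state(replication_factor: int, failed_replicas: int) -> str:
    """Return the protection state this section describes for a volume.

    Hypothetical helper: a simplified model of the text above. The real
    state is decided and reported by the cluster.
    """
    if not 1 <= replication_factor <= 3:
        raise ValueError("replication factor must be 1, 2, or 3")
    if not 0 <= failed_replicas <= replication_factor:
        raise ValueError("failed_replicas must be between 0 and the replication factor")

    healthy = replication_factor - failed_replicas
    if failed_replicas == 0:
        return "FullyProtected"
    if healthy == 0:
        return "Inaccessible"  # every node holding a replica has failed
    if healthy == 1:
        return "ReadOnly"      # RF3 with two failures, or RF2 with one failure
    return "Degraded"          # RF3 with a single failure


# Print the full matrix of cases covered by the text.
for rf in (3, 2, 1):
    for failed in range(rf + 1):
        print(f"RF{rf}, {failed} failed -> {protection_state(rf, failed)}")
```

The model treats only nodes that hold a replica of the volume as relevant. A failure of a node that holds none of the volume's data does not change the result.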
You can view a volume’s protection state by issuing the lbcli list volumes command.
Sample Command
$ lbcli -J $LIGHTOS_JWT --project-name=a list volumes
Sample Output
Name  UUID      Protection State  State    Size     Replicas  ACL
vol1  76c3eae8  FullyProtected    Created  200 GiB  3         values:"acl1"
vol2  3f3c3ad2  Degraded          Created  200 GiB  3         values:"acl2"
vol3  8700cba8  ReadOnly          Created  200 GiB  1         values:"acl3"
In this output, vol2 (Degraded) and vol3 (ReadOnly) are not in the FullyProtected protection state.
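When a project has many volumes, the same check can be scripted. The following is a minimal sketch that assumes the plain-text layout shown in the sample output, with the protection state as the third whitespace-separated field of each data row. The function name `unprotected_volumes` is hypothetical, and the sample text is embedded so the example runs without a cluster.

```python
SAMPLE_VOLUMES = """\
Name UUID Protection State State Size Replicas ACL
vol1 76c3eae8 FullyProtected Created 200 GiB 3 values:"acl1"
vol2 3f3c3ad2 Degraded Created 200 GiB 3 values:"acl2"
vol3 8700cba8 ReadOnly Created 200 GiB 1 values:"acl3"
"""


def unprotected_volumes(output: str) -> list[tuple[str, str]]:
    """Return (name, protection_state) for volumes that are not FullyProtected.

    Assumes the column layout of the sample output above; this layout is an
    assumption and may differ between lbcli versions.
    """
    results = []
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        name, protection = fields[0], fields[2]
        if protection != "FullyProtected":
            results.append((name, protection))
    return results


print(unprotected_volumes(SAMPLE_VOLUMES))
# [('vol2', 'Degraded'), ('vol3', 'ReadOnly')]
```

Splitting on whitespace is fragile because some column headers and values contain spaces, so treat this as an illustration of the check rather than a robust parser.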
Next, use the lbcli list nodes command (a Cluster Admin command) to identify which node has failed. In this command’s output, each node is shown in one of the following node states:
Node State | Description |
---|---|
Activating | Node is being activated and is currently unable to serve IOs. This state can occur after a node reconnects to the network, comes up from a reboot, or recovers from any other failure state. After the activation is complete, the node’s state transitions to Active. |
Active | Node is active and can serve IOs. |
Deactivating | Node failure is detected and the Lightbits cluster software is changing the roles of other nodes in the cluster to keep data accessible. |
Sample Command
$ lbcli -J $LIGHTOS_JWT list nodes
Sample Output
NAME        UUID                                  State     NVME-Endpoint
server00-0  192af7c0-d39f-4872-b849-7eb3dc0f7b53  Active    10.23.26.13:4420
server01-0  1f4ef0ce-0634-47c7-9e5f-d4fd910ff376  Active    10.23.26.8:4420
server02-0  6d9b8337-18cd-4b14-bea1-f56aca213d68  Inactive  10.23.26.4:4420
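To pick out the failed node from a longer listing, the output can be filtered for any node whose state is not Active. This is a minimal sketch that assumes the four-column plain-text layout shown in the sample output. The function name `nodes_not_active` is hypothetical, and the sample text is embedded so the example runs without a cluster.

```python
SAMPLE_NODES = """\
NAME UUID State NVME-Endpoint
server00-0 192af7c0-d39f-4872-b849-7eb3dc0f7b53 Active 10.23.26.13:4420
server01-0 1f4ef0ce-0634-47c7-9e5f-d4fd910ff376 Active 10.23.26.8:4420
server02-0 6d9b8337-18cd-4b14-bea1-f56aca213d68 Inactive 10.23.26.4:4420
"""


def nodes_not_active(output: str) -> list[dict[str, str]]:
    """Return the nodes whose state is anything other than Active.

    Assumes each data row has exactly four whitespace-separated fields
    (name, UUID, state, NVMe endpoint), as in the sample output above.
    """
    results = []
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 4:
            continue  # ignore blank or malformed lines
        name, uuid, state, endpoint = fields
        if state != "Active":
            results.append(
                {"name": name, "uuid": uuid, "state": state, "endpoint": endpoint}
            )
    return results


for node in nodes_not_active(SAMPLE_NODES):
    print(f"{node['name']} is {node['state']} ({node['endpoint']})")
# server02-0 is Inactive (10.23.26.4:4420)
```

In the sample output, server02-0 is the only node reported, which matches the volumes that are no longer FullyProtected.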