Post-Installation Steps
After a successful Lightbits cluster installation, perform the following steps.
Back Up Important Content
Back up the following content to a secure location. It will be useful if another node is added in the future or if troubleshooting is required.
- Back up the ~/light-app directory and all of its contents. The contents of this Ansible directory will be helpful later if you need to add servers or review how a previous installation was done.
- Back up the generated JWTs: the system JWT and the default project JWT. By default they are placed in the home directory as ~/lightos-system-jwt and ~/lightos-default-admin-jwt.
- Back up the generated certificates. By default these are placed in ~/lightos-certificates.
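The backup items listed above can be collected into a single archive. The following is a minimal sketch, not part of the Lightbits tooling: the function name backup_lightbits_artifacts and the archive naming are hypothetical, and the four paths are the defaults named in this section. Adjust them if your installation placed the files elsewhere.

```shell
# Sketch only: archive the default installation artifacts into one tarball.
# backup_lightbits_artifacts is a hypothetical helper, not a Lightbits command.
#   $1 = home directory that holds the artifacts (for example "$HOME")
#   $2 = destination directory for the archive
backup_lightbits_artifacts() {
    src_home="$1"
    dest_dir="$2"
    archive="$dest_dir/lightbits-install-backup-$(date +%Y%m%d-%H%M%S).tar.gz"

    mkdir -p "$dest_dir" || return 1

    # -C makes the archived paths relative to the home directory.
    # tar fails (and so does this function) if any of the four paths is missing.
    tar -czf "$archive" -C "$src_home" \
        light-app \
        lightos-system-jwt \
        lightos-default-admin-jwt \
        lightos-certificates || return 1

    # The archive contains JWTs and certificates, so restrict it to the owner.
    chmod 600 "$archive" || return 1

    # Print the archive path so the caller can copy it to secure storage.
    printf '%s\n' "$archive"
}

# Example call (the destination path is an assumption):
#   backup_lightbits_artifacts "$HOME" /mnt/secure-backups
```

Because the archive holds credentials, store it with the same care as the JWT files themselves.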
Check Cluster Health
- Copy the contents of the system JWT file ~/lightos-system-jwt to the clipboard.
From the Ansible host, run:
cat ~/lightos-system-jwt; echo;
The trailing echo adds a newline at the end, so it is easy to see where the JWT ends.
The contents will be similar to this:
export LIGHTOS_JWT=eyJhbGciOiJSUzI1Ni<...CONTENTS OF JWT...>5PcYPBRBaFEuMsT9gQNQA
The JWT is long and sits entirely on a single line.
- Log in to any Lightbits server and paste the contents into the shell. The JWT will now be available via the $LIGHTOS_JWT environment variable.
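A long single-line paste is easy to truncate, so it can be worth confirming that the variable holds a complete token before using it. The sketch below relies only on the general JWT compact format (three non-empty base64url segments separated by dots). It does not verify the signature, and the function name looks_like_jwt is hypothetical.

```shell
# Sketch only: shape check for a pasted JWT (no signature verification).
# looks_like_jwt is a hypothetical helper, not a Lightbits command.
looks_like_jwt() {
    case "$1" in
        *[!A-Za-z0-9._-]*) return 1 ;;  # character outside base64url or "."
        .*|*.|*..*)        return 1 ;;  # an empty segment
        *.*.*.*)           return 1 ;;  # more than three segments
        *.*.*)             return 0 ;;  # exactly three non-empty segments
        *)                 return 1 ;;  # empty value or fewer than three segments
    esac
}

# Example call after pasting the export line:
#   looks_like_jwt "$LIGHTOS_JWT" && echo "LIGHTOS_JWT is set and well-formed"
```

A failed check usually means the paste was cut short or picked up a line break.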
- Check the state of the servers, nodes, and cluster.
The servers look healthy, as they all report "NoRiskOfServiceLoss":
server00:~$ lbcli -J $LIGHTOS_JWT list servers
NAME UUID State RiskOfServiceLoss State LightOSVersion
server00 5c2b7375-64fa-583e-8ebe-e82b8e1e1a53 Enabled NoRiskOfServiceLoss 3.1.2~b1125
server01 bb5433d2-9740-5130-a5eb-47f623c19b4d Enabled NoRiskOfServiceLoss 3.1.2~b1125
server02 a9097bf0-6005-5dca-8f07-eeaca114ec70 Enabled NoRiskOfServiceLoss 3.1.2~b1125
The nodes also look healthy, as they all report the "Active" state:
server00:~$ lbcli -J $LIGHTOS_JWT list nodes
Name UUID State NVMe endpoint Failure domains Local rebuild progress
server00-0 d2c336f2-c9e5-5a4c-951e-ede739d10774 Active 10.10.10.100:4420 [server00] None
server01-0 810ed593-a97b-5495-b08a-be5e37a65f82 Active 10.10.10.101:4420 [server01] None
server02-0 2cf6ce67-fe5e-5a98-86bf-98a5792a8916 Active 10.10.10.102:4420 [server02] None
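The two table checks above can also be scripted rather than read by eye. The sketch below assumes the whitespace-separated column layout of the sample output shown in this section (the fourth field of each server row and the third field of each node row); other lbcli versions may order the columns differently. The function name rows_not_matching is hypothetical.

```shell
# Sketch only: flag rows of an lbcli table whose column differs from a value.
# rows_not_matching is a hypothetical helper, not a Lightbits command.
#   $1 = 1-based field number to inspect
#   $2 = value every data row is expected to have in that field
# Reads the table on stdin, skips the header line, prints the first field
# (the name) of each non-matching row, and returns nonzero if any was found.
rows_not_matching() {
    awk -v col="$1" -v want="$2" '
        NR > 1 && NF >= col && $col != want { print $1; bad = 1 }
        END { exit bad + 0 }
    '
}

# Example calls, using the field positions seen in the sample output:
#   lbcli -J $LIGHTOS_JWT list servers | rows_not_matching 4 NoRiskOfServiceLoss
#   lbcli -J $LIGHTOS_JWT list nodes   | rows_not_matching 3 Active
```

An empty result with a zero exit status means every row reported the expected value.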
Also check that the cluster health state is OK:
server00:~$ lbcli -J $LIGHTOS_JWT get cluster -o yaml
ETag: "0"
MaxAllowedVersion: 3.2.X
MinAllowedVersion: 3.1.X
MinVersionInCluster: 3.1.2~b1125
UUID: c2719be6-00b8-4e96-b80a-6a84ecc3e638
apiEndpoints:
- 10.10.10.100:443
- 192.168.16.22:443
- 10.10.10.101:443
- 192.168.16.92:443
- 10.10.10.102:443
- 192.168.16.32:443
clusterName: c2719be6-00b8-4e96-b80a-6a84ecc3e638
currentMaxReplicas: 3
discoveryEndpoints:
- 10.10.10.100:8009
- 10.10.10.101:8009
- 10.10.10.102:8009
health:
numDegradedVolumes: 0
numInactiveNodes: 0
numNotAvailableVolumes: 0
numReadOnlyVolumes: 0
state: OK <------------------------------------------ health state is OK
statistics:
compressionRatio: 1
effectivePhysicalStorage: "52311333273"
estimatedFreeLogicalStorage: "41226327633"
estimatedLogicalStorage: "51963745873"
freePhysicalStorage: "41490028953"
installedPhysicalStorage: "94489280512"
logicalStorage: "16106127360"
logicalUsedStorage: "10737418240"
managedPhysicalStorage: "60129542144"
physicalUsedStorage: "10821304320"
physicalUsedStorageIncludingParity: "10821304320"
subsystemNQN: nqn.2016-01.com.lightbitslabs:uuid:2304a078-e5d0-40e8-94d6-3656d24d1337:suffix
supportedMaxReplicas: 3
At this point the cluster's health has been confirmed at the node, server, and cluster level.
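The cluster-level check can be scripted in the same spirit. The sketch below assumes that the first state: key in the YAML output is the one under health:, as in the sample above; if your version prints other state: keys earlier, the match would need to be narrowed. The function name cluster_state_is_ok is hypothetical.

```shell
# Sketch only: succeed if the cluster YAML on stdin reports "state: OK".
# cluster_state_is_ok is a hypothetical helper, not a Lightbits command.
cluster_state_is_ok() {
    # Take the value of the first "state:" key; leading indentation is ignored
    # because awk splits fields on whitespace.
    state=$(awk '$1 == "state:" && !seen { print $2; seen = 1 }')
    [ "$state" = "OK" ]
}

# Example call:
#   lbcli -J $LIGHTOS_JWT get cluster -o yaml | cluster_state_is_ok \
#       && echo "cluster health state is OK"
```

A nonzero exit status means the state was missing or was something other than OK, in which case the full YAML output should be inspected by hand.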