Connecting Application Servers to Lightbits

Connecting an application server to the volumes on the Lightbits storage server is accomplished through the following procedure.

Connecting an Application Server to a Volume

| Step | Command Type | Simplified Command |
| --- | --- | --- |
| Get details about the Lightbits storage cluster | Lightbits lbcli CLI | lbcli get cluster, lbcli list nodes |
| Get details about the Lightbits storage cluster | Lightbits REST | GET /api/v2/cluster, GET /api/v2/nodes |
| Verify network connectivity | Linux command | ping <IP of Lightbits instance> |
| Connect to the Lightbits cluster | NVMe CLI | nvme connect <your Lightbits connection details> |
| Review block device details | Linux command | lsblk or nvme list |

Only cluster admins have access to cluster-level and node-level APIs. Therefore, tenant admins should get all of the required connection details from their cluster admin.

Before You Begin

Before you begin the process to connect an application server to the Lightbits storage server, confirm that the following conditions are met:

  • A volume exists on the Lightbits storage server with the correct ACL (either the client's ACL value or ALLOW_ANY).
  • A TCP/IP connection exists to the Lightbits storage server.
  • If you are a tenant admin, you should get all of the connection details from your cluster admin.

Reviewing the Lightbits Storage Cluster Connection Details (Cluster Admin Only)

The following table lists the details that you need for the nvme connect command on your application server. You can retrieve this information with lbcli commands (or the equivalent REST API calls).

Required Lightbits Storage Cluster Connection Details

| Item | Description | NVMe Connect Command Parameter | lbcli Command (To Get the Information) |
| --- | --- | --- | --- |
| Subsystem NQN | The NQN of the Lightbits cluster that was used to create the volume. | -n | lbcli get cluster |
| Instance IP addresses | The IP addresses of all of the nodes in the Lightbits cluster. | -a | lbcli list nodes |
| TCP ports | The TCP ports used by the Lightbits cluster nodes. | -s | lbcli list nodes |
| ACL string | The ACL used when you created the volume on the Lightbits storage cluster. | -q | lbcli get volume <volume name> |

Obtaining the Lightbits Cluster Subsystem NQN

On any Lightbits server, enter the lbcli get cluster command.

Sample Command

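For example, on any Lightbits server:

```bash
lbcli get cluster
```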

Sample Output

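Representative output (the UUID and NQN values are placeholders, and the exact columns can vary by Lightbits release):

```bash
UUID                                  SubsystemNQN                                                             CurrentMaxReplicas
b5f2c1d0-1111-2222-3333-444455556666  nqn.2016-01.com.lightbitslabs:uuid:b5f2c1d0-1111-2222-3333-444455556666  3
```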

The output includes the subsystem NQN for the Lightbits cluster.

Obtaining the Lightbits Nodes Data IP Addresses and TCP Ports

On any Lightbits server, enter the lbcli list nodes command.

Sample Command

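For example, on any Lightbits server:

```bash
lbcli list nodes
```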

Sample Output

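Representative output for a three-node cluster (hostnames, UUIDs, and addresses are placeholders):

```bash
Name        UUID                                  State   NVME endpoint     Failure domains
server00-0  0a1b2c3d-0000-0000-0000-000000000001  Active  10.23.26.8:4420   [server00]
server01-0  0a1b2c3d-0000-0000-0000-000000000002  Active  10.23.26.9:4420   [server01]
server02-0  0a1b2c3d-0000-0000-0000-000000000003  Active  10.23.26.10:4420  [server02]
```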

The NVME endpoint field in the output includes the instance IP address and TCP port for each of the Lightbits cluster's nodes.

Obtaining the Volume ACL String

The ACL string is the ACL you used when you created the volume on the Lightbits storage cluster.

You can also review the list of existing volumes and their ACLs by running lbcli list volumes or lbcli get volume on any of the Lightbits servers.
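For example (the volume name and project name are placeholders):

```bash
lbcli get volume --name=vol1 --project-name=default
```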

Verifying TCP/IP Connectivity

Before you run the nvme connect command on the application server, enter a Linux ping command to check the TCP/IP connectivity between your application server and the Lightbits storage cluster.

Sample Command

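For example:

```bash
[rack02-server70 ~]$ ping 10.23.26.8
```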
  • rack02-server70: An application server
  • 10.23.26.8: The Instance IP address on one of the Lightbits storage cluster nodes

Sample Output

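Representative output (latency values are illustrative):

```bash
PING 10.23.26.8 (10.23.26.8) 56(84) bytes of data.
64 bytes from 10.23.26.8: icmp_seq=1 ttl=64 time=0.190 ms
64 bytes from 10.23.26.8: icmp_seq=2 ttl=64 time=0.164 ms
^C
--- 10.23.26.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1018ms
rtt min/avg/max/mdev = 0.164/0.177/0.190/0.013 ms
```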

The output indicates this application server has a good connection to the Lightbits storage instance.

It is recommended to repeat this check with all of the IP addresses obtained from the lbcli list nodes command.

Connecting to the Lightbits Cluster

With the IP, port, subsystem NQN, and ACL values for your volume, you can execute the nvme connect command.

You must repeat the nvme connect command for each of the NVMe endpoints retrieved by the lbcli list nodes command.

Sample NVMe Connect Command

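The following is a sketch that uses the placeholder values from the examples above; substitute your own endpoint address, port, subsystem NQN, and ACL string:

```bash
nvme connect -t tcp \
  -a 10.23.26.8 -s 4420 \
  -n nqn.2016-01.com.lightbitslabs:uuid:b5f2c1d0-1111-2222-3333-444455556666 \
  -q acl1 \
  --ctrl-loss-tmo -1
```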
  • Use the client procedure for each node in the cluster. Remember to use the correct NVME-Endpoint for each node.
  • Add the --ctrl-loss-tmo -1 flag to allow infinite attempts to reconnect nodes. This prevents a timeout from occurring when attempting to connect with a node in a failure state.
  • During the connection phase to a client, the system can crash if you use NVMe/TCP drivers that are not supported by Lightbits.

For more details on the NVMe CLI, see the NVMe CLI Overview section of this document.

Currently, Lightbits only supports TCP for the transport type value.

The above connect command connects you to the primary node where the volume resides. It is recommended to have the discovery client installed on all clients. The discovery client automatically pulls the required information from the cluster (or from several clusters), discovers all the volumes the client has access to, and maintains high availability, so that if the primary fails, the optimized NVMe/TCP path moves to the new primary. See the Discovery Client Deployment section for more information.

After you have entered the nvme connect command, you can confirm the client's connection to the Lightbits cluster by entering the nvme list-subsys command.
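For example (a representative tree using the placeholder values from above; the exact format depends on your nvme-cli version):

```bash
$ nvme list-subsys
nvme-subsys0 - NQN=nqn.2016-01.com.lightbitslabs:uuid:b5f2c1d0-1111-2222-3333-444455556666
\
 +- nvme0 tcp traddr=10.23.26.8 trsvcid=4420 live optimized
 +- nvme1 tcp traddr=10.23.26.9 trsvcid=4420 live inaccessible
 +- nvme2 tcp traddr=10.23.26.10 trsvcid=4420 live inaccessible
```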

Reviewing Block Device Details on the Application Server

After the nvme connect command completes, you can see the available block devices on the application server using the Linux lsblk command or the nvme list command.

The following example shows how to use the Linux lsblk command to list all block devices after the nvme connect command has been executed. This command lists all of the block devices on the client, including the Lightbits volumes the client is connected to (all volumes whose ACL includes the client, and all volumes set to ALLOW_ANY).

Sample Command

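For example:

```bash
lsblk
```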

Sample Output

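Representative output (the local boot disk and partition layout are illustrative):

```bash
NAME      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda         8:0    0  100G  0 disk
├─sda1      8:1    0    1G  0 part /boot
└─sda2      8:2    0   99G  0 part /
nvme0n1   259:0    0   10G  0 disk
```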

In this example output, you can see the 10GB NVMe/TCP device with the name nvme0n1. This name indicates that the device is:

  • On NVMe subsystem 0 (the nvme0 prefix)
  • The first namespace on that subsystem (the n1 suffix)

Your Lightbits storage cluster is now initialized and ready for use.

You can configure your applications to use this NVMe/TCP connected volume as you would with any other Linux block device.

NVMe/TCP MultiPath

NVMe multipath I/O refers to two or more independent paths between a single host and a namespace. Each path uses its own controller, although multiple controllers can share a subsystem port. Multipath I/O, like namespace sharing, requires that the NVM subsystem contain two or more controllers.

Multipath is part of the NVMe specification and is used by the Lightbits cluster software as follows:

  1. The primary node exposes the path to the volume.
  2. Clients send read and write requests to the primary node.
  3. The primary node replicates to the secondary nodes.
  4. If the primary node fails, the secondary node exposes a path to the client so the client can continue working with the secondary node.

Lightbits uses a proprietary protocol on top of TCP to replicate data between primary and secondary nodes, without requiring any changes to the client.

The default CloudFormation (CF) stack does not deploy any client machine in your environment. To test the functionality and performance of your AWS-based Lightbits cluster, you therefore need to deploy an AWS instance with an operating system that supports NVMe over TCP via the nvme_tcp kernel module.

Such distributions include:

  • Ubuntu
  • RHEL
  • Amazon Linux
  • Rocky

Make sure your kernel version has a fully functional nvme_tcp kernel module. Lightbits recommends kernel version 5.4 and above.
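To check that the module is available and loaded on your client, a quick verification might look like this:

```bash
modinfo nvme_tcp         # confirm the module exists for the running kernel
sudo modprobe nvme_tcp   # load the module
lsmod | grep nvme_tcp    # confirm it is loaded
```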

The following is an example of a sequence of commands to test client connectivity:

  1. Connect to one of your storage instances using Session Manager (SSH).
  2. Get the system JWT token.
  3. List nodes for status.
  4. Create a test volume using lbcli create volume.

To connect to a storage instance, go to the EC2 Instances dashboard and select one of the lightbits-node instances. Then click Connect > Session Manager.

Within the session manager (SSH) window:

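A minimal sketch of the in-session commands. The JWT file location is an assumption (it varies by deployment), and the volume name, size, ACL, and replica count are illustrative:

```bash
# Get the system JWT token (the path below is hypothetical; check your deployment docs)
sudo cat /etc/lightbits-jwt/default-admin-jwt

# List nodes for status, passing the JWT explicitly until it is added to lbcli.yaml
lbcli list nodes -J <JWT>

# Create a test volume (name, size, ACL, and replica count are illustrative)
lbcli create volume --name=vol1 --size=10GiB --acl=acl1 --replica-count=3 -J <JWT>
```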

You can copy the JWT and add it to the file located at /etc/lbcli/lbcli.yaml.

Example:

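An illustrative /etc/lbcli/lbcli.yaml entry (the token is truncated; verify the exact key name against your lbcli version):

```yaml
jwt: eyJhbGciOiJSUzI1NiIsImtpZCI6InN5c3RlbTpyb290In0...
```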

This allows you to run further lbcli management commands against the storage cluster without specifying the JWT with each command.

  1. Create a client instance within the same VPC as the storage cluster.
  2. Connect to the client using SSH.
  3. Load the nvme-tcp module.
  4. Discover the volume with discovery-service via the NLB (Load Balancer) URL.
  5. Run an fio read/write test (see the sketch below).
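The following is a sketch of that client-side sequence. The NLB DNS name, device name, and fio parameters are placeholders; port 8009 is the conventional NVMe/TCP discovery port:

```bash
# Load the NVMe/TCP kernel module
sudo modprobe nvme_tcp

# Discover and connect to all volumes this client can access, via the
# discovery service behind the NLB (DNS name is a placeholder)
sudo nvme connect-all -t tcp -a <nlb-dns-name> -s 8009 --ctrl-loss-tmo=-1

# Confirm the new block device
sudo nvme list

# Run a short random read/write fio test (device name and parameters are illustrative)
sudo fio --name=rwtest --filename=/dev/nvme0n1 --rw=randrw --bs=4k --iodepth=16 \
    --numjobs=4 --time_based --runtime=60 --ioengine=libaio --direct=1
```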

Node Rebuild

The Lightbits cluster can rebuild data on a node if it is out of sync with the replicas on other nodes. The cluster identifies that a node is not in sync and triggers the rebuild process. This can happen due to connectivity issues to the node, software issues that caused the node to stop responding, or an instance restart.

The node decides whether to perform a partial rebuild (usually after a short disruption or reboot) or a full rebuild (in cases of prolonged downtime). During the rebuild process, all volumes that have a replica on the affected node are in degraded mode. Once the rebuild is done, all volumes return to fully available.
