Discovery-client Deployment and Usage
Overview
The discovery-client is an open-source service managed by systemd. It is the mechanism your host uses to discover nodes and volumes within Lightbits clusters.
Once volumes are discovered, the discovery-client establishes and manages connections to them, so that data can be transferred via NVMe-over-Fabrics. The discovery-client also adapts to dynamic environments, automatically updating connections whenever nodes are added to or removed from the cluster. This keeps your host in communication with Lightbits storage clusters and their volumes.
The discovery-client performs several tasks:
- Maintains an updated list of NVMe-over-Fabrics discovery controllers. Any changes in the Lightbits cluster controllers will automatically update this list.
- Executes nvme discover commands against these discovery controllers to discover available NVMe-over-Fabrics subsystems. These commands are triggered either by an Asynchronous Event Notification (AEN) received from a remote discovery controller, or by a user-specified configuration file.
- Automatically connects to available NVMe-over-Fabrics subsystems by running nvme connect commands.
You can configure the discovery-client via configuration files or using command-line options. This article will cover both.
Setting up Repos for RPM and Debian
For Debian-Based Systems (Ubuntu, Debian, etc.)
- Gather reference information about the client's OS version and codename:
lsb_release -a
- Install the required packages:
apt-get install -y debian-keyring # debian only
apt-get install -y debian-archive-keyring # debian only
apt-get install -y apt-transport-https
- Set the keyring location variable, depending on the OS version:
# For Debian Stretch, Ubuntu 16.04 and later
keyring_location=/usr/share/keyrings/lightbits-discovery-client-archive-keyring.gpg
# For Debian Jessie, Ubuntu 15.10 and earlier
keyring_location=/etc/apt/trusted.gpg.d/lightbits-discovery-client.gpg
- Download the GPG key:
curl -1sLf 'https://dl.lightbitslabs.com/public/discovery-client/gpg.014E5C7FAFD89AEE.key' | gpg --dearmor > "${keyring_location}"
- Download the discovery-client repo file:
curl -1sLf 'https://dl.lightbitslabs.com/public/discovery-client/config.deb.txt?distro=ubuntu&codename=xenial' | sudo tee /etc/apt/sources.list.d/lightbits-discovery-client.list
Note: If required, replace the distribution and codename with your actual operating system and codename (based on the output of lsb_release -a in the first step). For example, for Ubuntu 22.04, replace xenial with jammy.
- Update the repo:
apt-get update
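The keyring rule and the repo URL substitution described above can be sketched as two small shell helpers. This is only an illustration: repo_url and keyring_path are hypothetical names (not part of discovery-client), and the sketch assumes that Debian Stretch corresponds to major release 9 and that only the debian and ubuntu distro IDs are relevant.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with discovery-client.

# Build the repo-file URL for a given distro ID and codename.
repo_url() {
    printf 'https://dl.lightbitslabs.com/public/discovery-client/config.deb.txt?distro=%s&codename=%s\n' "$1" "$2"
}

# Pick the keyring path from the distro ID and release number.
# Rule from the steps above: Debian Stretch (9) / Ubuntu 16.04 and later
# use /usr/share/keyrings; older releases use /etc/apt/trusted.gpg.d.
keyring_path() {
    distro=$1
    major=${2%%.*}            # "22.04" -> "22"
    case "$distro" in
        debian) min=9 ;;
        ubuntu) min=16 ;;
        *) echo "unsupported distro: $distro" >&2; return 1 ;;
    esac
    if [ "$major" -ge "$min" ]; then
        echo /usr/share/keyrings/lightbits-discovery-client-archive-keyring.gpg
    else
        echo /etc/apt/trusted.gpg.d/lightbits-discovery-client.gpg
    fi
}

# Example with literal values for Ubuntu 22.04 (jammy).
repo_url ubuntu jammy
keyring_path ubuntu 22.04
```

On most recent Debian-based systems the same values can be read from the ID, VERSION_ID, and VERSION_CODENAME fields of /etc/os-release instead of being typed by hand.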
For RPM-based systems (Red Hat, etc.)
- Install the prerequisites:
yum install -y yum-utils
yum install -y pygpgme
If pygpgme is not found, continue with the instructions. Newer releases do not require it.
- Download and import the GPG key:
rpm --import 'https://dl.lightbitslabs.com/public/discovery-client/gpg.014E5C7FAFD89AEE.key'
- Download the repo file:
curl -1sLf 'https://dl.lightbitslabs.com/public/discovery-client/config.rpm.txt?distro=el&codename=7' | sudo tee /etc/yum.repos.d/lightbits-discovery-client.repo
- Update the repo cache:
yum makecache -y --disablerepo='*' --enablerepo='lightbits-discovery-client'
Installation
For RPM-Based Systems:
yum install -y discovery-client
For Debian-Based Systems:
apt-get install -y discovery-client
Initial Setup
- Check if the nvme-tcp module is loaded:
lsmod | grep nvme
- If not, load it:
modprobe nvme-tcp
Use the following guide to enable the nvme-tcp module persistently through reboots: Client Module Configurations.
- Check if nvme-core multipath is enabled:
cat /sys/module/nvme_core/parameters/multipath
If enabled ("Y"), continue to the next initial setup step.
If disabled ("N"), use the following guide to enable nvme-core multipath for the current session and persistently through reboots: Client Module Configurations.
- Start the discovery-client service:
systemctl start discovery-client
- Find or generate the host NQN:
To find it, run:
cat /etc/nvme/hostnqn
To generate one, run:
nvme gen-hostnqn > /etc/nvme/hostnqn
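The checks in this section can be combined into a single preflight script. The following is a minimal sketch, not an official tool: the function names are hypothetical, each path can be overridden by an argument, and it assumes nvme-tcp is built as a loadable module (a driver compiled into the kernel does not appear in /proc/modules).

```shell
#!/bin/sh
# Hypothetical preflight helpers; not shipped with discovery-client.

# Succeeds if the named module is listed in the modules file.
# The kernel lists modules with underscores, so nvme-tcp appears as nvme_tcp.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

# Succeeds if the nvme-core multipath parameter reads "Y".
multipath_enabled() {
    [ "$(cat "${1:-/sys/module/nvme_core/parameters/multipath}" 2>/dev/null)" = "Y" ]
}

# Succeeds if the host NQN file exists and is not empty.
hostnqn_present() {
    [ -s "${1:-/etc/nvme/hostnqn}" ]
}

# Run all three checks and report every failure, not just the first.
preflight() {
    rc=0
    module_loaded nvme_tcp || { echo "nvme-tcp module is not loaded"; rc=1; }
    multipath_enabled      || { echo "nvme-core multipath is not enabled"; rc=1; }
    hostnqn_present        || { echo "/etc/nvme/hostnqn is missing or empty"; rc=1; }
    return "$rc"
}
```

Calling preflight on the client prints one line per failed check and returns a non-zero status if any check fails, which makes it easy to run before starting the service.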
Volume Creation
From the Lightbits side, create a volume and set its ACL to the hostnqn. Additionally, obtain the cluster NQN, which will be required for connection methods 1 and 2 below.
- Create a volume with HostNQN ACL:
lbcli create volume <parameters> --acl <hostnqn>
- Find the Lightbits cluster NQN:
lbcli get cluster
Configuration For Persistence Across Reboots
You can configure the discovery-client in one of three ways. Methods 1 and 2 persist across reboots; Method 3 does not.
Method 1: Using a Configuration File
This method connects to a Lightbits volume based on the configuration file when the discovery-client service starts. Therefore, if the service is enabled to start on boot, the connection is also established on boot, making it persistent across reboots.
Create a Configuration File
The configuration file should be placed under /etc/discovery-client/discovery.d/<name>.conf. In the file, specify the entries for the Lightbits clusters that you want to connect to. Each entry specifies the transport, address, port, and other relevant details. If the directory does not exist, create it first with mkdir /etc/discovery-client/discovery.d/
Below is a sample configuration that connects to all of the current nodes of an example cluster. Technically, only one connection line is required for discovery to work. However, listing all of the nodes has no negative effects and provides redundancy if one of them is unreachable.
-t tcp -a 172.16.176.11 -s 8009 -q host1 -n nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09
-t tcp -a 172.16.175.11 -s 8009 -q host1 -n nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09
-t tcp -a 172.16.175.10 -s 8009 -q host1 -n nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09
-t tcp -a 172.16.176.12 -s 8009 -q host1 -n nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09
-t tcp -a 172.16.176.10 -s 8009 -q host1 -n nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09
-t tcp -a 172.16.175.12 -s 8009 -q host1 -n nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09
Where:
-t: Transport type (tcp)
-a: The IP address of a node on the Lightbits data network. All of the IPs can be retrieved from the "lbcli list nodes" command output, under the NVMe endpoint column.
-s: The port used by the discovery-service (8009).
-q: The hostnqn/ACL name of the client. You can use the value stored in the file /etc/nvme/hostnqn, or specify any string. This value must match the acl attribute set in the "lbcli create volume" command.
-n: The Lightbits cluster NQN. This parameter must match the value from the "lbcli get cluster" command output, under the Subsystem NQN column.
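Because every line of the sample differs only in the node IP, the file can be generated instead of typed. The helper below is a sketch: gen_discovery_conf is a hypothetical name, and the fixed values (tcp transport, port 8009) are taken from the sample above.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with discovery-client.
# Usage: gen_discovery_conf <hostnqn> <clusternqn> <ip>...
# Prints one entry per node IP in the same format as the sample above.
gen_discovery_conf() {
    hostnqn=$1
    clusternqn=$2
    shift 2
    for ip in "$@"; do
        printf '%s\n' "-t tcp -a $ip -s 8009 -q $hostnqn -n $clusternqn"
    done
}

# Example using two of the node IPs from the sample cluster.
gen_discovery_conf host1 \
    nqn.2016-01.com.lightbitslabs:uuid:3715aea0-2705-4a01-9357-c6a9f8009f09 \
    172.16.176.11 172.16.175.11
```

Redirect the output to /etc/discovery-client/discovery.d/<name>.conf to create the file, then review it before restarting the service.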
Restart the Discovery-Client Service
systemctl restart discovery-client
Method 2: Using the Command Line to add hostnqn
You can use the discovery-client add-hostnqn command to establish a connection that persists across reboots. Like Method 1 above, this command creates a configuration file.
discovery-client add-hostnqn -a <ip>:8009 -n <clusternqn> -q <hostnqn> --name <config-file-name>
Method 3: Using the Command Line to connect-all
You can use the discovery-client connect-all command to establish a connection. This method does not persist across reboots.
The following illustrates a sample usage:
discovery-client connect-all -t tcp -a <ip> -s 8009 -q <hostnqn> -p
Where:
-t: Transport type (tcp)
-a: The IP address of a node on the Lightbits data network. All of the IPs can be retrieved from the "lbcli list nodes" command output, under the NVMe endpoint column.
-s: The port used by the discovery-service (8009).
-q: The hostnqn/ACL name of the client. You can use the value stored in the file /etc/nvme/hostnqn, or specify any string. This value must match the acl attribute set in the "lbcli create volume" command.
-p: Keeps the discovery connection persistent after the initial volume connection, so that it continues to monitor for changes. This is the default behavior for connection methods 1 and 2 above.
Usage
- List the connected Lightbits volumes:
nvme list
- Disconnect from a given controller:
discovery-client disconnect -d <dev>
- Disconnect all controllers:
discovery-client disconnect-all
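Before disconnecting a single controller, it can help to see which NVMe controllers on the host use the TCP transport. The sketch below reads the standard Linux sysfs attributes transport, state, and subsysnqn under /sys/class/nvme. The function name list_tcp_controllers is hypothetical, and whether the printed controller name is the value expected for <dev> is an assumption that should be checked against the command's help output.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with discovery-client.
# Prints "<controller> <state> <subsystem NQN>" for each NVMe/TCP controller.
# The sysfs root can be overridden with the first argument.
list_tcp_controllers() {
    root=${1:-/sys/class/nvme}
    for ctrl in "$root"/nvme*; do
        # Skip entries without a readable transport attribute
        # (this also covers the case where the glob matches nothing).
        [ -r "$ctrl/transport" ] || continue
        [ "$(cat "$ctrl/transport")" = "tcp" ] || continue
        printf '%s %s %s\n' "${ctrl##*/}" "$(cat "$ctrl/state")" "$(cat "$ctrl/subsysnqn")"
    done
}

list_tcp_controllers
```

Controllers attached over other transports, such as local PCIe drives, are filtered out, so the output is limited to fabric connections over TCP.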
Logs and Troubleshooting
Logs are maintained in the /var/log/discovery-client directory. Tail the logs to identify issues:
tail -f /var/log/discovery-client/discovery-client.log
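When the log is long, filtering it can be quicker than following it live. The helper below is a sketch only: recent_errors is a hypothetical name, and matching the word "error" case-insensitively is an assumption about the log format, which is not documented here.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with discovery-client.
# Usage: recent_errors [logfile] [count]
# Prints the last <count> lines (default 20) that contain "error" in any case.
recent_errors() {
    log=${1:-/var/log/discovery-client/discovery-client.log}
    count=${2:-20}
    grep -i 'error' "$log" | tail -n "$count"
}
```

For example, recent_errors with no arguments shows up to 20 matching lines from the default log file. Adjust the pattern if the service uses a different keyword or severity field.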