Discovery-client Deployment and Usage

Overview

The discovery-client is an open-source service, managed by systemd, that helps orchestrate your Lightbits storage environment. Specifically, it is the mechanism for discovering nodes and volumes within Lightbits clusters.

Once volumes are discovered, the discovery-client establishes and manages connections to them, so that data is transferred over NVMe-over-Fabrics. It is also designed for dynamic environments: it automatically updates connections whenever nodes are added to or removed from the cluster. This ensures that your host stays connected to Lightbits storage clusters and their volumes.

The discovery-client performs several tasks:

  • Maintains an up-to-date list of NVMe-over-Fabrics discovery controllers. The list is updated automatically whenever the Lightbits cluster's controllers change.
  • Executes nvme discover commands against these discovery controllers to find the available NVMe-over-Fabrics subsystems. These commands are triggered either by an Asynchronous Event Notification (AEN) received from a remote discovery controller, or by a user-specified configuration file.
  • Automatically connects to the available NVMe-over-Fabrics subsystems by running nvme connect commands.
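
For orientation, the discover and connect tasks listed above correspond roughly to the following manual nvme-cli calls. This is only a sketch, not the discovery-client's actual implementation: the address, both NQNs, and the I/O port 4420 are placeholder assumptions, and the helper prints the commands instead of running them.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration only).
DISCOVERY_IP="10.0.0.1"      # a Lightbits data-network IP
DISCOVERY_PORT="8009"        # discovery-service port
IO_PORT="4420"               # assumed NVMe/TCP I/O port
HOST_NQN="nqn.2014-08.org.nvmexpress:uuid:example-host"
SUBSYS_NQN="nqn.2016-01.com.example:subsystem"

# Print the commands rather than executing them, so the sketch is safe to run.
print_manual_steps() {
    printf '%s\n' "nvme discover -t tcp -a $DISCOVERY_IP -s $DISCOVERY_PORT -q $HOST_NQN"
    printf '%s\n' "nvme connect -t tcp -a $DISCOVERY_IP -s $IO_PORT -n $SUBSYS_NQN -q $HOST_NQN"
}

print_manual_steps
```

The discovery-client automates these steps and repeats them whenever the cluster changes.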

You can configure the discovery-client through configuration files or command-line options. This article covers both.

Setting up Repos for RPM and Debian

For Debian-Based Systems (Ubuntu, Debian, etc.)

  1. Gather reference information about the client's OS version and codename:
  2. Install the required packages:
  3. Set the keyring location variable, depending on the OS version:
  4. Download the GPG key:
  5. Download the discovery-client repo file:

Note: If required, replace the distribution and codename with those of your actual operating system (based on the output of lsb_release -a in the first step). For example, for Ubuntu 22.04, replace xenial with jammy.

  6. Update the repo:
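
As an illustration of the first step and the note above, the distribution ID and codename can also be read from an os-release file. This is a minimal sketch that assumes the system provides /etc/os-release with a VERSION_CODENAME field; the function name print_codename is hypothetical.

```shell
#!/bin/sh
# Read the distribution ID and codename from an os-release style file.
# The path is a parameter so the function can be tried on a copy of the file.
print_codename() {
    os_release_file="${1:-/etc/os-release}"
    if [ -r "$os_release_file" ]; then
        # Source in a subshell so the variables do not leak into the caller.
        (
            . "$os_release_file"
            echo "${ID:-unknown} ${VERSION_CODENAME:-unknown}"
        )
    else
        echo "unknown unknown"
    fi
}

print_codename
```

On systems where lsb_release is installed, lsb_release -cs prints the codename on its own.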

For RPM-based systems (Red Hat, etc.)

  1. Install the prerequisites:

If pygpgme is not found, continue with the next step. Newer releases do not require it.

  2. Download and import the GPG key:
  3. Download the repo file:
  4. Update the repo cache:

Installation

For RPM-Based Systems:


For Debian-Based Systems:

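
The following sketch prints an install command for whichever package manager is present. It assumes that the package is named discovery-client, like the service; check the repo you configured if the name differs. Nothing is installed: the command is only printed.

```shell
#!/bin/sh
# Assumption: the package is named "discovery-client", like the service.
PKG="discovery-client"

# Print (not run) the install command for the first package manager found.
print_install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install -y $PKG"
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install -y $PKG"
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install -y $PKG"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

print_install_cmd || true
```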

Initial Setup

  1. Check if the nvme-tcp module is loaded:
  2. If not, load it:

To load the nvme-tcp module persistently across reboots, see the guide Client Module Configurations.

  3. Check if nvme-core multipath is enabled:

If enabled ("Y"), continue to the next initial setup step.

If disabled ("N"), see the guide Client Module Configurations to enable nvme-core multipath for the current session and persistently across reboots.

  4. Start the discovery-client service:
  5. Find or generate the host NQN:

To find it, run:

To generate one, run:
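
The checks in the steps above can be sketched as read-only helpers. The default paths are the usual Linux locations, and the function names are hypothetical. Each path is a parameter so that the helpers can be tried against sample files.

```shell
#!/bin/sh
# Paths are parameters (defaulting to the usual Linux locations) so each
# check can be tried against sample files without touching the system.

# Report whether the nvme_tcp module appears in a /proc/modules style file.
nvme_tcp_loaded() {
    modules_file="${1:-/proc/modules}"
    if grep -q '^nvme_tcp ' "$modules_file" 2>/dev/null; then
        echo "loaded"
    else
        echo "not loaded"    # load it with: sudo modprobe nvme-tcp
    fi
}

# Print the nvme-core multipath setting: Y, N, or "unknown" if unreadable.
multipath_state() {
    param_file="${1:-/sys/module/nvme_core/parameters/multipath}"
    cat "$param_file" 2>/dev/null || echo "unknown"
}

# Print the host NQN if one has already been generated.
host_nqn() {
    nqn_file="${1:-/etc/nvme/hostnqn}"
    cat "$nqn_file" 2>/dev/null || echo "missing"    # create: nvme gen-hostnqn
}

echo "nvme-tcp:  $(nvme_tcp_loaded)"
echo "multipath: $(multipath_state)"
echo "hostnqn:   $(host_nqn)"
```

If the host NQN is missing, nvme gen-hostnqn prints a new one, which can be saved to /etc/nvme/hostnqn. Because the service is managed by systemd, it can typically be started with sudo systemctl start discovery-client, assuming the unit carries that name.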

Volume Creation

On the Lightbits side, create a volume and set its ACL to the client's host NQN. Also obtain the cluster NQN, which is required for connection methods 1 and 2 below.

  1. Create a volume with the host NQN as its ACL:
  2. Find the Lightbits cluster NQN:

Configuration For Persistence Across Reboots

You can configure the discovery-client in one of three ways. Methods 1 and 2 persist across reboots; Method 3 does not.

Method 1: Using a Configuration File

This method connects to a Lightbits volume, based on the configuration file, whenever the discovery-client service starts. If the service is enabled to start on boot, the connection is therefore also established on boot, which makes it persistent across reboots.

Create a Configuration File

Place the configuration file at /etc/discovery-client/discovery.d/<name>.conf. In the file, add an entry for each Lightbits cluster that you want to connect to. Each entry specifies the target, address, port, and other relevant details. If the directory does not exist, create it first with mkdir /etc/discovery-client/discovery.d/.

Below is a sample configuration that connects to all of the current nodes of an example cluster. Technically, only one connection line is required for discovery to work. However, listing all of the nodes has no negative effects and provides high availability and redundancy.


Where:

-t: The transport type (tcp).

-a: The IP of the Lightbits data network. All of the IPs can be retrieved from the output of the "lbcli list nodes" command, under the NVMe endpoint column.

-s: The port used by the discovery-service (8009).

-q: The hostnqn/ACL name of the client. You can use the value stored in the file /etc/nvme/hostnqn, or specify any string. This value must match the acl attribute set in the "lbcli create volume" command.

-n: The Lightbits cluster NQN. This value must match the one in the "lbcli get cluster" command output, under the Subsystem NQN column.
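
A configuration file built from the options described above might be generated as follows. This is a sketch under stated assumptions: each entry is taken to be a single line of options, and the IPs, both NQNs, and the file name example.conf are placeholders. By default the file is written to a scratch directory rather than to /etc.

```shell
#!/bin/sh
# All values below are placeholders (assumptions); substitute your own.
HOST_NQN="nqn.2014-08.org.nvmexpress:uuid:example-host"
CLUSTER_NQN="nqn.2016-01.com.example:cluster"
NODE_IPS="10.0.0.1 10.0.0.2 10.0.0.3"

# Emit one entry per node IP, using the options described above.
render_conf() {
    for ip in $NODE_IPS; do
        printf '%s\n' "-t tcp -a $ip -s 8009 -q $HOST_NQN -n $CLUSTER_NQN"
    done
}

# Write to a scratch directory by default; set CONF_DIR to
# /etc/discovery-client/discovery.d (as root) to install the file for real.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
render_conf > "$CONF_DIR/example.conf"
cat "$CONF_DIR/example.conf"
```

The file is read when the discovery-client service starts, so a restart is needed after the file is installed.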

Restart the Discovery-Client Service


Method 2: Using the Command Line to add hostnqn

You can use the discovery-client add-hostnqn command to establish a connection that persists across reboots. Like Method 1 above, this command creates a configuration file.


Method 3: Using the Command Line to connect-all

You can use the discovery-client connect-all command to establish a connection. This method does not persist across reboots.

The following illustrates a sample usage:


Where:

-t: The transport type (tcp).

-a: The IP of the Lightbits data network. All of the IPs can be retrieved from the output of the "lbcli list nodes" command, under the NVMe endpoint column.

-s: The port used by the discovery-service (8009).

-q: The hostnqn/ACL name of the client. You can use the value stored in the file /etc/nvme/hostnqn, or specify any string. This value must match the acl attribute set in the "lbcli create volume" command.

-p: Keeps the discovery connection persistent after the initial volume connection, so that the discovery-client continues to monitor for changes. This is the default behavior for connection methods 1 and 2 above.
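
Putting the options described above together, a connect-all invocation might look like the line printed by this sketch. The IP and host NQN are placeholders, and the command is printed rather than executed.

```shell
#!/bin/sh
# Placeholder values (assumptions); substitute your own.
NODE_IP="10.0.0.1"
HOST_NQN="nqn.2014-08.org.nvmexpress:uuid:example-host"

# Build the command line from the options described above.
connect_all_cmd() {
    printf '%s\n' "discovery-client connect-all -t tcp -a $NODE_IP -s 8009 -q $HOST_NQN -p"
}

# Printed rather than executed, so the sketch is safe to run anywhere.
connect_all_cmd
```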

Usage

  1. List the connected Lightbits volumes:
  2. Disconnect from a given controller:
  3. Disconnect all controllers:
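
The three operations above map onto standard nvme-cli commands. The sketch below wraps them in a dry-run helper so that nothing is disconnected by accident. The device name nvme0 is a placeholder and the function names are hypothetical.

```shell
#!/bin/sh
# Dry-run by default: commands are printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

list_volumes()   { run nvme list; }
disconnect_one() { run nvme disconnect -d "$1"; }   # e.g. nvme0 (placeholder)
disconnect_all() { run nvme disconnect-all; }

list_volumes
disconnect_one nvme0
```

Run the script with DRY_RUN=0 (as root) to execute the commands for real.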

Logs and Troubleshooting

Logs are maintained in the /var/log/discovery-client directory. Tail the logs to identify issues:
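
Because the log file names are not listed here, the sketch below prints the last lines of every file in the directory. The helper name show_recent_logs and the default of 20 lines are arbitrary choices.

```shell
#!/bin/sh
# Print the last lines of every regular file in the log directory.
# The directory is a parameter because the log file names are not assumed.
show_recent_logs() {
    log_dir="${1:-/var/log/discovery-client}"
    lines="${2:-20}"
    for f in "$log_dir"/*; do
        [ -f "$f" ] || continue
        echo "==> $f <=="
        tail -n "$lines" "$f"
    done
    return 0
}

show_recent_logs
```

To follow a single file live, use tail -f on it. Since the service is managed by systemd, journalctl -u discovery-client -f may also show its output, assuming the unit carries that name.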
