Host Server Configuration
This article details how to define configuration files for each "Ansible Host" (server) in the cluster.
Return to the ~/light-app/ansible/inventories/cluster_example directory that you created in Inventory Structure and Adding the Ansible Hosts File.
~/light-app/ansible/inventories/cluster_example
|-- group_vars
|   |-- all.yml
|-- hosts
|-- host_vars
|   |-- client00.yml   <- This file can be ignored or deleted.
|   |-- server00.yml
|   |-- server01.yml
|   |-- server02.yml
From this path, we will edit each of the .yml files found in the ~/light-app/ansible/inventories/cluster_example/host_vars subdirectory. In our example cluster, we have three Lightbits storage nodes that are defined by the files:
host_vars/server00.yml
host_vars/server01.yml
host_vars/server02.yml
In each of the host variable files, update the following required variables.
Required Variables for the Host Variable File
Variable | Description |
---|---|
name | The cluster server's name. Example: serverXX. Must match the filename (without the extension) and the server names configured in the "hosts" file. |
instanceID | Identifies the logical node in this server whose configuration parameters follow. Currently, Lightbits supports up to two logical nodes per server. |
ec_enabled | (per logical node) Enables Erasure Coding (EC), which protects against SSD failure within the storage server so that IO is not interrupted; normal operation continues during reconstruction while a drive is removed. At least six NVMe devices must be present in the node for erasure coding to be enabled. |
failure_domains | (per logical node) The servers that share a network, power supply, or physical location, and are therefore affected together when network, power, cooling, or other critical services experience problems. Different copies of the data are stored in different FDs to keep the data protected from such failures. To specify the servers in the FD, add the server names. For additional information, see Defining Failure Domains. |
data_ip | (per logical node) The data IP used to connect to other servers. Can be IPv4 or IPv6. |
storageDeviceLayout | (per logical node) Sets the SSD configuration for a node. This includes the number of initial SSD devices, the maximum number of SSDs allowed, whether devices can span NUMA nodes, and memory partitioning and total capacity. For additional information, see Setting the SSD Configuration. |
allowCrossNumaDevices | Leave this setting as "false" if all of the NVMe drives counted for this instance are in the same NUMA node. Set it to "true" if this instance must communicate across NUMA nodes to access its NVMe drives. |
deviceMatchers | Determines which NVMe drives are considered for data and which are ignored. For example, if the OS drive is an NVMe drive, it can be excluded using the name option. The default settings work well for most deployments: only NVMe drives larger than 300 GiB and without partitions are counted as data drives. |
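As a quick orientation before filling in the files, the skeleton below shows where each of these variables sits in a host variable file. The values are placeholders only; complete examples for server00, server01, and server02 appear later in this article.

name: serverXX                      # must match the filename and the server name in the hosts file
nodes:
- instanceID: 0                     # logical node ID; Lightbits supports up to two per server
  data_ip: <data network IP>        # IPv4 or IPv6
  failure_domains:
  - serverXX                        # servers that fail together; see Defining Failure Domains
  ec_enabled: true                  # requires at least six NVMe devices in the node
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: <NVMe drives installed now>
    maxDeviceCount: <SSD slots available to this node>
    allowCrossNumaDevices: false    # true only if the node's drives span NUMA nodes
    deviceMatchers:
    - partition == false
    - size >= gib(300)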
Filling Out Drive Information
To fill out the initialDeviceCount for each instance, check the NUMA placement of the NVMe drives. The command below shows which NUMA node each NVMe drive belongs to; you can also update your planning table with this information. Note that the main example of this installation section assumes that each server has six NVMe drives in NUMA 0. This is configured as initialDeviceCount 6 and allowCrossNumaDevices false (all of the drives are in the same NUMA node, so cross-NUMA communication is not required). maxDeviceCount is configured as 12, because there are a total of 12 available SSD slots for this NUMA node.
for i in /sys/class/nvme/*/; do echo "- $i model: "`cat $i/model` sn: `cat $i/serial` numa_node: `cat $i/numa_node || cat $i/device/numa_node`; done 2> /dev/null | sort -V | nl
The example output below is unrelated to the cluster that we are demonstrating for installation; however, it is useful for showing how to interpret the command output and how to apply the configuration in a dual-NUMA situation. It shows six drives in NUMA 0 and six drives in NUMA 1 (the value after numa_node is the NUMA ID). Based on this output, the following configurations are possible for this server:
- One instance could be configured with an initialDeviceCount of 12 and allowCrossNumaDevices set to true (because the instance would communicate across NUMA nodes to reach its drives). Assuming this server has 24 drive slots, configure maxDeviceCount to 24.
- Configure instance 0 with an initialDeviceCount of 6 and instance 1 with an initialDeviceCount of 6, with allowCrossNumaDevices set to false on both instances. Assuming this server has 24 drive slots evenly split between the two NUMA nodes, configure maxDeviceCount to 12 for both instances. A sketch of this layout follows the output below.
- Other configurations are possible as long as they are valid, do not exceed the actual device count, and meet the requirements. See Host Configuration File Variables for additional information.
1 - /sys/class/nvme/nvme0/ model: INTEL SSDPF2KX038TZ sn: PHAC1501016D3P8AGN numa_node: 0
2 - /sys/class/nvme/nvme1/ model: INTEL SSDPF2KX038TZ sn: PHAC150103Z63P8AGN numa_node: 0
3 - /sys/class/nvme/nvme2/ model: INTEL SSDPE2KX040T8 sn: PHLJ830201HM4P0DGN numa_node: 0
4 - /sys/class/nvme/nvme3/ model: INTEL SSDPF2KX038TZ sn: PHAC150101253P8AGN numa_node: 0
5 - /sys/class/nvme/nvme4/ model: INTEL SSDPE2KX040T8 sn: BTLJ844409Z64P0DGN numa_node: 0
6 - /sys/class/nvme/nvme5/ model: INTEL SSDPE2KX040T8 sn: PHLJ8325031H4P0DGN numa_node: 0
7 - /sys/class/nvme/nvme6/ model: INTEL SSDPE2KX040T8 sn: PHLJ8325031F4P0DGN numa_node: 1
8 - /sys/class/nvme/nvme7/ model: INTEL SSDPE2KX040T8 sn: BTLJ844502G54P0DGN numa_node: 1
9 - /sys/class/nvme/nvme8/ model: INTEL SSDPE2KX040T8 sn: BTLJ844502GF4P0DGN numa_node: 1
10 - /sys/class/nvme/nvme9/ model: INTEL SSDPF2KX038TZ sn: PHAC1501012J3P8AGN numa_node: 1
11 - /sys/class/nvme/nvme10/ model: INTEL SSDPF2KX038TZ sn: PHAC150102Z73P8AGN numa_node: 1
12 - /sys/class/nvme/nvme11/ model: INTEL SSDPF2KX038TZ sn: PHAC1501048P3P8AGN numa_node: 1
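For the second option above (two logical nodes, one per NUMA node), the host variable file for such a server might look like the sketch below. The server name, data IPs, failure domain, and slot counts are illustrative assumptions for this dual-NUMA example, not values from our three-server cluster.

name: serverXX
nodes:
- instanceID: 0
  data_ip: 10.10.10.110            # illustrative data IP for instance 0
  failure_domains:
  - serverXX
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6          # the six drives reported with numa_node 0
    maxDeviceCount: 12             # assuming 12 of the 24 slots belong to NUMA 0
    allowCrossNumaDevices: false
    deviceMatchers:
    - partition == false
    - size >= gib(300)
- instanceID: 1
  data_ip: 10.10.10.111            # illustrative data IP for instance 1
  failure_domains:
  - serverXX
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6          # the six drives reported with numa_node 1
    maxDeviceCount: 12             # assuming 12 of the 24 slots belong to NUMA 1
    allowCrossNumaDevices: false
    deviceMatchers:
    - partition == false
    - size >= gib(300)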
To update these parameters, the cluster details table is useful.
Ensure that the designated drives do not contain any partitions.
Installation Planning Table Sample
The following is an example for three Lightbits servers in a cluster with a single client. Each Lightbits server comes with an initial set of six NVMe drives and can be expanded to up to 12 drives.
Server Name | Role | Management Network IP | Data NIC Interface Name | Data NIC IP | NVMe Drives |
---|---|---|---|---|---|
server00 | Lightbits Storage Server 1 | 192.168.16.22 | ens1 | 10.10.10.100 | 6 |
server01 | Lightbits Storage Server 2 | 192.168.16.92 | ens1 | 10.10.10.101 | 6 |
server02 | Lightbits Storage Server 3 | 192.168.16.32 | ens1 | 10.10.10.102 | 6 |
client00 | Client | 192.168.16.45 | ens1 | 10.10.10.103 | N/A |
The following are three examples for the three host variable files.
server00.yml
name: server00
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  failure_domains:
  - server00
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
server01.yml
name: server01
nodes:
- instanceID: 0
  data_ip: 10.10.10.101
  failure_domains:
  - server01
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
server02.yml
name: server02
nodes:
- instanceID: 0
  data_ip: 10.10.10.102
  failure_domains:
  - server02
  ec_enabled: true
  lightfieldMode: SW_LF
  storageDeviceLayout:
    initialDeviceCount: 6
    maxDeviceCount: 12
    allowCrossNumaDevices: false
    deviceMatchers:
    # - model =~ ".*"
    - partition == false
    - size >= gib(300)
    # - name =~ "nvme0n1"
- See Host Configuration File Variables for the entire list of variables available for the host variable files.
- You can also reference additional host configuration file examples.
- Typically, the servers should already be configured with the data_ip. However, the Ansible playbook can configure the data NIC IP; for that, you will need to add a data_ifaces section with the data interface name (a sketch appears after this list). For additional information, see Configuring the Data Network. The article on Setting the SSD Configuration also shows an example of this configuration.
- If you need to create a separate partition for etcd data on the boot device, see etcd Partitioning.
- Based on the placement of SSDs in the server, check if you need to make a change in the client profile to permit cross-NUMA devices.
- Starting from Version 3.1.1, data IP can be IPv6. For example:
data_ip: 2600:80b:210:440:ac0:ebff:fe8b:ebc0
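To illustrate the data_ifaces note above, the following is a minimal sketch of how such a section might be added to server00.yml, assuming the ens1 interface and addresses from the planning table. The field names shown here (bootproto, conn_name, ifname, ip4) and the /24 prefix are assumptions for illustration; see Configuring the Data Network for the authoritative format.

name: server00
data_ifaces:
- bootproto: static          # assumed static addressing for the data NIC
  conn_name: ens1            # data NIC interface name from the planning table
  ifname: ens1
  ip4: 10.10.10.100/24       # data NIC IP; the /24 prefix length is an assumption
nodes:
- instanceID: 0
  data_ip: 10.10.10.100
  # ... remainder of the node definition as shown in server00.yml above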