Setting the Journal SSD Configuration

Installation

The Lightbits SSD Journaling feature is currently available only for new installations, starting with the v3.17.1 release. You will need to allocate additional SSDs to be used as SSD Journal device(s).

For example, if each node in the cluster is planned with 12 SSD devices for data, you will need an additional SSD device for journaling.

In dual-instance nodes, each instance requires its own Journal SSD device on its own NUMA node. Therefore, in the above example, you will need to make sure that each NUMA node has an additional SSD device for journaling.

Sizing and Performance

It is recommended to use low latency, high endurance, and write-intensive devices such as:

  • Micron 7450 PRO U.3 15.36
  • Kioxia F6
  • Intel P4510
  • Solidigm (Intel) D7-P5600
  • Micron XTR SSD

These are the key parameters to look for when selecting a journaling drive:

  1. Endurance
    • Drive Writes Per Day (DWPD)
    • Size of the drives
  2. Sequential Write performance (throughput)

Endurance

DWPD is a metric that measures the endurance of an SSD by indicating how many times its entire capacity can be written per day over its warranty period.

The endurance of the device is calculated by multiplying the DWPD by the size of the drive (DWPD x size), which gives the amount of data that can be written to the drive per day over its warranty period. This defines how much write traffic a server will support before a journaling drive wears out and needs to be replaced.

In terms of this endurance calculation, a drive with 10 DWPD and a size of 1TB is equivalent to a drive with one DWPD and a size of 10TB: both can absorb 10TB of writes per day.

Lightbits recommends selecting SSD journal devices such that the sum of DWPD multiplied by size (in TB) across all journaling drives assigned to an instance exceeds 80 TB. Each Lightbits instance can utilize up to four journaling drives, and the 80 TB requirement is divided among the journaling drives used, so the per-drive endurance requirement drops as you add drives. For example, two drives with a DWPD x size of 40 TB each, or a single drive with a DWPD x size of 80 TB, would meet this recommendation.

Here are a few examples:

  1. Micron XTR, 960GB with a published 60 DWPD, will give you an endurance of ~57.6 TB (0.96 x 60). This is an excellent selection if you deploy two drives per instance.
  2. Kioxia CD8P-V, 12.8TB with three DWPD, will give you an endurance of ~38.4 TB (12.8 x 3). This is a good selection if you deploy two drives per instance.
  3. Samsung PM1743, 15.36TB with one DWPD, will give you an endurance of ~15.36 TB (15.36 x 1). This is not a good option for SSD Journaling, as you may be required to replace the drives frequently.

Sequential Write Performance

The Sequential Write performance of the device defines its peak write IOPS. If your system's write IOPS requirements per node are higher than the Sequential Write performance of a single device, you can always install two or more SSD Journal devices (up to a maximum of four).

Here are a couple of examples:

  1. Micron XTR, 960GB, has a published Sequential Write performance of 6800 MB/s, which is around 1.4M 4K IOPS.
  2. Kioxia CD8P-V, 12.8TB, has a published Sequential Write performance of 5500 MB/s, which is around 1.2M 4K IOPS.

General System Requirements

  • SSD Journaling requires mdadm (a Linux utility for managing software RAID arrays) to be installed on all servers in the cluster. During the installation process, if the mdadm package is not already installed, Ansible will install it.
  • Currently, all servers within a cluster must use the same write buffer configuration, meaning that they must either all utilize DCPMM or all use SSD journaling. Hybrid configurations, where some servers use DCPMM and others use SSD journaling within the same cluster, are not yet supported.

Configuring Global Variables in Ansible

Disable the persistent memory option in the global variables in the Ansible file all.yml:

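A minimal sketch of the relevant entry is shown below. It assumes the flag is the persistent_memory boolean mentioned in the note that follows; the file path in the comment is illustrative.

```yaml
# group_vars/all.yml (path illustrative) -- sketch only, assuming the global
# write-buffer flag is a plain boolean named persistent_memory
persistent_memory: false   # disable DCPMM so that SSD journaling is used instead
```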

Since the persistent_memory flag is a global property that applies to all servers in the cluster, it is important to declare this flag only once, in the all.yml file, and not in host vars files with different values.

Setting the SSD Configuration

For each host, you will have to define the required number of Journal SSDs and the matchers that identify the devices to be used for journaling. The host-level configuration is done under the host_vars directory (e.g., host_vars/server00.yml), using the journalDeviceLayout configuration. You will have to define the number of journal devices (between 1 and 4) and the matchers that identify the journal devices:

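A minimal sketch of such a host vars entry is shown below. The journalDeviceLayout key and the matcher expressions follow the descriptions in this section; the deviceCount and deviceMatchers field names and the gib() size helper are assumptions that should be checked against your installation's host vars template.

```yaml
# host_vars/server00.yml -- sketch only; deviceCount, deviceMatchers, and gib()
# are illustrative, while journalDeviceLayout and the matcher expressions follow the text
journalDeviceLayout:
  deviceCount: 2                      # number of Journal SSD devices (between 1 and 4)
  deviceMatchers:
    - model =~ "Kioxia F6*"           # drop this matcher if the model does not matter
    - partition == false              # use whole, unpartitioned devices
    - size > gib(500)                 # minimum size: excludes smaller data SSDs
    - size < gib(2000)                # maximum size: excludes larger data SSDs
```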

The correct way to work with matchers is to use only the specific matchers you need to identify the SSDs intended for journaling; a selected SSD must match all of the matchers you define. So if, for example, it does not matter which disk model is used for data and which for the journal, you can remove the model matcher or leave it as “.”

  • Model: The model type of SSD you want to select for journaling (e.g., model =~ "Kioxia F6*"). You can get the model from cat /sys/class/nvme/<device-name>/model.
  • Partition: Should be marked as false unless you are using a partitioned disk.
  • Size (>): The minimum size of the disk to be used for journaling. For example, if you have 1TB disks and 500GB disks and you want to use the 1TB disks for journaling, set the minimum size above 500GB so that only the 1TB disks match.
  • Size (<): The maximum size of the disk to be used for journaling. For example, if you have 1TB disks and 2TB disks and you want to use the 1TB disks for journaling and the 2TB disks for data, set the maximum size below 2TB so that only the 1TB disks match.

Example 1

In this example, all SSD devices on the server are identical, for both data and journal use. For the journal you just define partition == false, and the cluster will randomly select two of the devices:

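A sketch of this configuration, under the same field-name assumptions as above:

```yaml
# host_vars/server00.yml -- sketch only; deviceCount and deviceMatchers are illustrative
journalDeviceLayout:
  deviceCount: 2                      # two of the identical devices will be used for journaling
  deviceMatchers:
    - partition == false              # any whole, unpartitioned SSD qualifies
```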

For additional examples, see Host Configuration File Examples.
