Release 3.17.1

Release Date

v3.17.1 was released to the public on December 1, 2025.

New in This Release

This release introduces the following changes since version 3.16.x. Each change is classified as a new feature, an enhancement, a major issue (e.g., an issue that could lead to data loss or service loss), or a minor issue.

Issue Type | Description | ID
New feature (GA) | Introduced Write Buffer Journaling to expand hardware flexibility while maintaining high reliability for mission-critical environments. This new capability removes the dependency on specialized hardware by enabling consistent data protection and durability using standard server platforms. It simplifies procurement, reduces infrastructure cost, and broadens deployment options, which is particularly valuable for scaling AI, analytics, and cloud services. | LBM1-41367
New feature (GA) | Added support for NVMe/TCP In-Band Authentication (DH-HMAC-CHAP) to strengthen security for host-to-cluster connectivity. This capability enables authenticated NVMe/TCP sessions using industry-standard DH-HMAC-CHAP, ensuring that only verified hosts can establish I/O paths to the cluster. It improves security posture in multi-tenant, regulated, and large-scale environments without introducing operational complexity. The implementation is fully aligned with the NVMe 2.0 specification and supports seamless deployment across both existing and new clusters. Important: In this initial version, the auxiliary sub-system mode and in-band authentication are mutually exclusive and cannot be used at the same time. The ability to use both together is under active development and is planned for a future release. | LBM1-41366
Enhancement | Updated the RPM package naming convention to include architecture and distribution tags (e.g., .el9.x86_64). The new format is [package-name]-[version][build-identifier]-[release-number].[distro].[arch].rpm. This change improves clarity and aligns with Red Hat's official documentation. | LBM1-38037
Enhancement | Ansible capabilities have been extended to support a configurable JWT expiration date during deployment, allowing a new variable to override the default 365-day period. | LBM1-39990
Enhancement | ansible: The IPACL explicit flag now supports IPv6 addresses, in addition to the existing IPv4 addresses. | LBM1-37516
Enhancement | api-service: Enhanced system observability by extending the 'get/list nvmeDevice' and 'get/list node' API and lbcli commands. The 'get/list nvmeDevice' commands now display the device usage (e.g., Journal, Data, OS, Unmanaged), and the 'get/list node' commands now show the SSD journaling device type (e.g., NVMe, RAID). | LBM1-33337
Enhancement | api-service: The API service now supports IPv6 addresses for connected hosts. | LBM1-38762
Enhancement | CSI: Our CSI Docker images built on RHEL UBI now include 'UBI' in their names, providing clearer identification for these container images. | LBM1-38338
Enhancement | data-layer: To further enhance data consistency, the synchronization process for nodes reconnecting after a network disruption or server restart/downtime has been improved. This ensures that all pending operations, such as volume and snapshot deletions, are correctly applied and all nodes are properly synchronized, maintaining the proper protection state for all volumes. | LBM1-31156
Enhancement | discovery-client: Contributed to service reliability and high availability by preventing a silent connection failure. The discovery-client now detects whether existing NVMe controllers are present and automatically uses their hostid, overriding any conflicting hostid in the configuration. This fixes a rare misconfiguration issue that could prevent new I/O paths from being established. | LBM1-40087
Enhancement | discovery-client: Enhanced security and automation by adding support for pre-configuring DH-CHAP secrets in the discovery-client.yaml file. The discovery-client now automatically uses these secrets for authentication when establishing connections, contributing to more secure and reliable automated deployments. | LBM1-38858
Enhancement | duroslight: Added new performance counters to the duroslight log to provide visibility into active rebuild traffic. This enhancement allows for better monitoring of rebuild progress and contributes to better observability and management of the system. | LBM1-21281
Enhancement | duroslight: Improved NVMe connection logging to reduce verbosity while providing more useful diagnostic information. | LBM1-39324
Enhancement | duroslight: Support for enabling multiple sub-systems has been added to extend the number of clients that can connect to a Lightbits cluster. This new capability directs each client to a pre-configured sub-system, allowing for a substantial increase in overall client connection capacity. Important: In this initial version, the auxiliary sub-system mode and in-band authentication are mutually exclusive and cannot be used at the same time. The ability to use both together is under active development and is planned for a future release. | LBM1-38254
Enhancement | duroslight: When the number of metrics is high, metric collection can stall the reactor, resulting in increased max/tail latencies for specific workloads. Metric collection now yields so that I/O latency is not impacted. | LBM1-38857
Enhancement | lbcli: Added a deviceUsage column to the output of list nvmeDevices, showing the role of each device (data, journal, os, or unmanaged). This improves usability by making it easier to identify device types directly from the table view without needing extra lookups. | LBM1-38304
Enhancement | nvme-cli: To provide the most current features and security enhancements, Lightbits now relies on the official, industry-standard (upstream) nvme-cli package. This ensures alignment with upstream development and community support. | LBM1-38047
Enhancement | The Cluster Manager now immediately recognizes permanent node failure when a node becomes inactive due to consecutive disk failures, streamlining system response to critical hardware issues. | LBM1-35805
Enhancement | userlbe: Improved reclamation of storage consumed by TRIMs of deleted volumes. | LBM1-35875
Enhancement | With improved support for deployments with over 2K clients, the discovery client now also offers the ability to append a suffix to the NQN (NVMe Qualified Name) received during discovery. | LBM1-38341
Enhancement | Fixed the internal metrics descriptors function to support auto-generated docs. | LBM1-35765
Enhancement | Set enable_chrony=true by default to ensure that the time-sync agent is always installed as part of the Lightbits installation. | LBM1-38595
Major | duroslight: Contributed to data integrity and system reliability by fixing a rare issue where recent data could be lost. This could occur if the most recent snapshot was deleted while a replica node was offline, potentially causing the new data to be reverted to the snapshot's older data during a rebuild. | LBM1-39211
Major | duroslight: Further improved data integrity in DCPMM environments by resolving a rare issue related to taking a snapshot after an abrupt failure. This enhancement ensures that recently written data is consistently preserved, preventing unintended reversion to prior snapshot values, even in these infrequent scenarios. | LBM1-39168
Major | duroslight: To further enhance system stability, Intel's Data Streaming Accelerator (DSA) support is now disabled by default. This precautionary measure prevents potential hardware exceptions, ensuring more reliable operation. | LBM1-39628
Major | node-manager: Contributed to the overall stability of maintenance operations by resolving a rare deadlock. This fix prevents a node from hanging during shutdown if a new shutdown command is issued before a previous startup has fully completed. | LBM1-39872
Major | node-manager: The Node Manager now correctly processes configurations when health_state_timeout_ms is specified in Ansible variables, resolving a potential decoding issue with multi-line duroslight/conf.yaml files. This enhancement ensures greater reliability during deployments that use custom configuration settings. | LBM1-39676
Major | node-manager: To further enhance system stability, a rare case of a potential deadlock in the node's shutdown flow has been resolved. This addresses scenarios where a shutdown is initiated before the node's power-up sequence has fully completed, ensuring more robust behavior during rapid state transitions. | LBM1-38754
Major | nvme: Contributed to overall system reliability by preventing a node from sending I/O errors to a client after it had already been marked as inactive. This fixes a rare race condition, ensuring a cleaner and more stable client failover during a node failure. | LBM1-39661
Major | userlbe: Contributed to the overall stability and reliability by fixing a rare case of system halt when deleting a snapshot on TRIM-enabled systems. | LBM1-39336
Major | userlbe: GFTL allocations are now NUMA-interleaved. This prevents rare cases of single-node memory exhaustion. | LBM1-39819
Major | userlbe: Improved write performance for systems with TRIM enabled. | LBM1-39560
Major | Fixed a bug in how new dashboard changes are reflected during the upgrade flow. | LBM1-35676
Minor | api-service: Resolved an API service crash that could occur when listing cluster or node statistics while they were being updated. This fix improves the stability and reliability of these operations. | LBM1-38600
Minor | api-service: The DEKsRetentionPeriod cluster configuration key has been renamed to DeksRetentionPeriod to align with camel case conversion standards, enhancing naming consistency within the API service. | LBM1-39727
Minor | api-service: The GetClusterConfigParam API now includes more robust case handling for the EnableTrim parameter, improving efficiency and accuracy during cluster configuration retrieval. | LBM1-39731
Minor | cluster-manager: Extended the capability of the cluster encryption feature to support IPv6-only environments. This change resolves an IP address formatting issue, contributing to the feature's reliability and compatibility across all supported network configurations. | LBM1-37460
Minor | cluster-manager: The Cluster Manager now handles the node replacement process with greater stability and predictability. This enhancement resolves several underlying race conditions to ensure consistent and expected behavior. | LBM1-39119
Minor | discovery-service: Fixed an issue where Asynchronous Event Notifications (AEN) from the discovery service were missing during IP Access Control List (IPACL) updates, potentially preventing a client from connecting automatically as expected. | LBM1-39002
Minor | duroslight: For more precise performance analysis, the wrlat_xmit latency metric was added. This new measurement captures the time taken to send a Ready-to-Transfer (R2T) packet, providing valuable insight into the I/O path after internal resource gating and QoS delays, but before the client begins its data transmission. | LBM1-37206
Minor | duroslight: Resolved a very rare edge case that could cause the duroslight process to hang during graceful node shutdown if the primary node also shuts down or crashes at the same time. This enhancement specifically improves stability during concurrent rebuild operations, even when the rebuild primary node failed just moments before. | LBM1-39356
Minor | The Node Manager has been optimized to avoid redundant CPU cycles and unnecessary logging within the control plane during extended rebuild processes of degraded protection groups. This enhancement reduces system overhead and improves overall efficiency during critical recovery operations. | LBM1-38723
Minor | The wrlat_reply_qued statistic in Duroslight has been refined to accurately reflect only write I/O operations, excluding other NVMe commands. This improvement provides a more precise measure for analyzing write latency. | LBM1-38751
Minor | lbcli: For improved consistency and user flexibility, lbcli now accepts both "trusted-host-secrets" and "trusted-host-secret" with both the set and get commands. | LBM1-38366
Minor | node-manager: Fixed a rare panic that could occur during a graceful node shutdown. This resolves a race condition where a graceful shutdown could unexpectedly turn into an abrupt one, thus contributing to the overall stability and reliability of the cluster during maintenance operations. | LBM1-38853
Minor | Firewall rules are now correctly applied for each specified IP when creating an admin-endpoint, contributing to overall security. | LBM1-39869
Minor | System reliability has been further enhanced by resolving multiple underlying rare race conditions. This improvement ensures that the health status of each node is always reported with accuracy and that transitions between states, such as active and inactive, are handled consistently. | LBM1-38932
Minor | userlbe: Contributed to data integrity and system reliability by fixing a rare issue where a volume could get stuck in a "Deleting" state. This could occur if a volume was deleted and then quickly recreated before the deletion fully completed, causing the affected volume replica to silently ignore all incoming writes. This fix ensures that the volume state is correctly handled, preventing data inconsistency. | LBM1-40243
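
For automation that consumes the new RPM file names (LBM1-38037), the naming format can be split into its fields with a short script. The following is a minimal sketch, not an official tool. It assumes that the package name may contain hyphens, that no later field does, and that the version and build identifier can be kept together as one field, because the notes above do not state where one ends and the other begins. The file name `example-pkg-3.17.1~b1234-1.el9.x86_64.rpm` is hypothetical and used only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Format from the release notes:
#   [package-name]-[version][build-identifier]-[release-number].[distro].[arch].rpm
# Assumption: version and build identifier are kept as one opaque field,
# since the boundary between them is not specified.
_RPM_NAME = re.compile(
    r"(?P<name>.+)-(?P<version_build>[^-]+)-(?P<release>[^-.]+)"
    r"\.(?P<distro>[^-.]+)\.(?P<arch>[^-.]+)\.rpm"
)


@dataclass(frozen=True)
class RpmName:
    name: str
    version_build: str
    release: str
    distro: str
    arch: str


def parse_rpm_name(filename: str) -> Optional[RpmName]:
    """Return the parsed fields, or None if the name lacks distro/arch tags."""
    match = _RPM_NAME.fullmatch(filename)
    return RpmName(**match.groupdict()) if match else None


if __name__ == "__main__":
    # Hypothetical file name, used only to exercise the parser.
    print(parse_rpm_name("example-pkg-3.17.1~b1234-1.el9.x86_64.rpm"))
```

A name without the distribution and architecture tags, such as one following the previous convention, does not match and yields `None`, which lets a script detect packages that predate this release.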

Installation and Upgradeability

You can upgrade to this release from all previous Lightbits 3.14.x, 3.15.x, and 3.16.x releases.
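
As a pre-upgrade check, the supported source releases listed above can be encoded in a small helper. This is an illustrative sketch only: the function name `can_upgrade_directly` and the way the installed version string is obtained are assumptions, not part of the Lightbits tooling.

```python
# Release series from which a direct upgrade to 3.17.1 is supported,
# as listed in these release notes.
SUPPORTED_SOURCE_SERIES = {(3, 14), (3, 15), (3, 16)}


def can_upgrade_directly(installed_version: str) -> bool:
    """Return True if installed_version belongs to a supported source series.

    Accepts strings such as "3.16.2" or "v3.14.0". Anything that cannot be
    parsed as major.minor is treated as unsupported.
    """
    parts = installed_version.strip().lstrip("v").split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return False
    return (major, minor) in SUPPORTED_SOURCE_SERIES


if __name__ == "__main__":
    for version in ("3.16.2", "v3.14.0", "3.13.5"):
        print(version, can_upgrade_directly(version))
```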
