Release 3.15.1
Release Date
v3.15.1 was released to the public on June 03, 2025.
New in This Release
This release introduces the following changes since version 3.14.x. A change is classified as either a new feature, an enhancement, a major issue (e.g., an issue that could lead to potential data loss or service loss), or a minor issue.
Issue Type | Description | ID |
---|---|---|
Added API-driven rotation of the cluster root encryption key (KEK). Invoking the API will create a new KEK and re-encrypt the existing DEKs with the new KEK. See the relevant documentation in the Cluster-Level Encryption article. | LBM1-33506 | |
Upgraded etcd to version 3.5.18, which includes security enhancements and bug fixes. Note: The "DB Size" graph in the "ETCD by Prometheus" dashboards in Grafana relies on a deprecated metric. Use the updated dashboard provided in the 3.15.1 release for correct operation of this graph. | LBM1-36732 | |
Enhanced the NVMe deallocate implementation to improve capacity utilization. This enhancement integrates with the file system’s TRIM command, enabling the operating system to identify and release unused data blocks. Refer to the Lightbits TRIM Support documentation to enable this capability. Note that this feature is in tech preview, which means it should not be used in production setups. | LBM1-35258 | |
api-service : The replace node API now fails at the api-service if cluster encryption is enabled but the destination server is not yet encryption-enabled. | LBM1-36477 | |
build : Updated to golang 1.24.1. | LBM1-36917 | |
duroslight : Removed the timestamp from journal log entries (as the journal itself already provides them). | LBM1-36921 | |
node-manager : Automatically calculate the required timeout for node powerup, based on node capacity. The timeout was previously a fixed value. | LBM1-36874 | |
node-manager : Enabled selecting NVMe devices by serial number, as device paths are not stable and could change across reboots. | LBM1-36722 | |
prometheus : Updated Prometheus' record.rules.yaml to calculate the metric for total TCP connections per node. | LBM1-36609 | |
tpm : Get the TPM path dynamically, and do not use a hardcoded /dev/tpm0. | LBM1-34675 | |
cluster-manager : Eliminated redundant short rebuilds following a primary switch or a primary failover. Such rebuilds created an unnecessary single point of failure; i.e., additional failures during this short rebuild could cause some volumes to become Unavailable or Read Only - with both cases resulting in a loss of service. | LBM1-13327 | |
cluster-manager/duroslight : Fixed a race condition where, during the graceful shutdown of a node, incorrect node connectivity events could be sent. This could cause Lightbits to mistakenly assume a healthy server was disconnected and mark it as inactive. | LBM1-32958 | |
duroslight : Fixed a rare race condition during node recovery that tripped an assertion and caused duroslight to crash. | LBM1-37085 | |
node-manager : Added an extra safety measure to protect against having more than a single accessible path per volume. Prior to this change, following a primary switch, new primaries would wait for the old ones to disable themselves before marking themselves as optimized paths. However, they would only wait for as long as it would take a node to internally consider itself self-failed. If a node that was setting itself as a non-optimized path encountered an issue that made it slow or unresponsive to the relevant update, two optimized paths could be exposed. | LBM1-36651 | |
node-manager : Fixed an issue where node-manager did not attempt to restart GFTL.service if systemd failed to start the service. This fix introduces a retry mechanism. | LBM1-37361 | |
node-manager : Fixed an issue where a server failed to power up when configured with a single instance across multiple NUMA nodes and all SSDs were located in the second NUMA node. This issue could only be triggered during upgrades or the addition of new servers. | LBM1-37114 | |
userlbe : Fixed a bug where IO errors during abrupt recovery were handled incorrectly - thereby causing recovery to fail. | LBM1-37205 | |
api-service : Prior to this modification, the customer's IdP server would be accessed as soon as the IdP configuration was created, regardless of whether the Federated Authentication feature was enabled. This fix ensures that the IdP server is only accessed if both the IdP configuration is in place and the Federated Authentication feature is enabled. | LBM1-35674 | |
cluster-manager : Fixed an issue that could cause an event to incorrectly indicate that migrating volumes are degraded when in fact they are fully protected. | LBM1-33865 | |
data-layer : Fixed an issue where a delete server task could remain after the server was deleted. | LBM1-37395 | |
data-layer : Fixed a rare issue that caused the rebuild progress to be reported as 1% instead of the actual rebuild progress. | LBM1-37389 | |
data-layer : Resolved an issue that prevented proactive rebalance from selecting target nodes that previously had permanent failures, and that prevented proactive rebalance of volumes that had snapshots, even after a long period of time. Under certain conditions, obsolete snapshot keys remained in etcd even though their corresponding data had been moved or deleted. | LBM1-34858 | |
discovery-client : Fixed an issue that could cause nvme discover to crash with a nil pointer exception when receiving malformed responses. | LBM1-36790 | |
discovery-client : The host's hostid is now passed when issuing nvme connect. Previously, discovery could fail because Lightbits did not pass this value and the kernel would send a random one, which could cause the error "nvme_fabrics: found same hostnqn <...> but different hostid <...>" when using both the discovery-client and nvme-cli in parallel. | LBM1-36644, LBM1-36282 | |
lightbits-api : Fixed the listHosts and listVolumes APIs, which previously reported hosts that were not in the volume's IP ACL as connected to the volume, although they were not connected. | LBM1-36964 | |
userlbe : Fixed a race condition that, in rare scenarios, could cause a node to perform an abrupt power-up instead of a graceful power-up. | LBM1-37264 | |
userlbe : Fixed a potential crash when adding a new disk during system runtime. | LBM1-37070 | |
userlbe : Fixed a GFTL (lbe) crash due to a rare race condition between an SSD failure/removal and SSD read submissions by UNITREADER: observed only with multiple concurrent volume/node rebuilds running. The problem signature in the log is: "IO was partially done. Handling not implemented yet (completed: 0 bytes..." followed by "Assert failed on:(unit->is gc_read_unit)". | LBM1-36882 | |
userlbe : Fixed a rare issue that could only happen on VMs with very little storage, where recovery could crash with a start position equal to the target position when recovery wrapped around the storage. | LBM1-37387 | |
userlbe : Resolved an issue where a process would terminate abruptly during shutdown instead of exiting gracefully. Note that in such cases, the system will still perform a graceful power-up. | LBM1-36844 |
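The node-manager enhancement listed above (LBM1-36722) selects NVMe devices by serial number because device paths can change across reboots. The following is a minimal sketch of that general idea, not the node-manager implementation: it assumes a Linux host where each NVMe controller exposes its serial number in a `serial` file under `/sys/class/nvme/<controller>/`. The function name `find_nvme_by_serial` and the sample serial numbers are hypothetical.

```python
import tempfile
from pathlib import Path


def find_nvme_by_serial(serial, sysfs_root="/sys/class/nvme"):
    """Return the controller name (e.g. 'nvme0') whose serial matches, or None.

    Hypothetical helper for illustration only; sysfs_root is a parameter so
    the lookup can be exercised against a fake directory tree.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return None
    for ctrl in sorted(root.iterdir()):
        try:
            # sysfs pads the serial with spaces, so strip before comparing.
            found = (ctrl / "serial").read_text().strip()
        except OSError:
            continue  # entry has no readable serial attribute
        if found == serial:
            return ctrl.name
    return None


if __name__ == "__main__":
    # Build a fake sysfs tree so the sketch runs on any machine.
    with tempfile.TemporaryDirectory() as fake_root:
        for name, sn in (("nvme0", "SN-AAA"), ("nvme1", "SN-BBB")):
            ctrl_dir = Path(fake_root) / name
            ctrl_dir.mkdir()
            (ctrl_dir / "serial").write_text(sn + "      \n")
        print(find_nvme_by_serial("SN-BBB", fake_root))
```

The controller name returned for a given serial may differ from one boot to the next, which is why the serial number, rather than the name or path, is the stable key.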
Installation and Upgradeability
You can upgrade to this release from all previous Lightbits 3.12.x, 3.13.x, and 3.14.x releases.
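As a pre-upgrade sanity check, the supported source releases stated above can be expressed as a small lookup. This is an illustrative sketch only: the function name `can_upgrade_directly` and the version-string format are assumptions, and it does not query a Lightbits cluster.

```python
# Release series from which a direct upgrade to 3.15.1 is supported,
# per the statement above: 3.12.x, 3.13.x, and 3.14.x.
SUPPORTED_SOURCE_SERIES = {(3, 12), (3, 13), (3, 14)}


def can_upgrade_directly(installed_version):
    """Return True if installed_version (e.g. '3.13.2' or 'v3.14.1')
    belongs to a series listed as a supported upgrade source."""
    parts = installed_version.lstrip("v").split(".")
    try:
        series = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        return False  # not a recognizable major.minor version string
    return series in SUPPORTED_SOURCE_SERIES
```

For example, `can_upgrade_directly("3.13.2")` returns `True`, while `can_upgrade_directly("3.11.4")` returns `False`.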