Release 3.13.1

Release Date

v3.13.1 was released to the public on February 12, 2025.

New in This Release

This release introduces the following changes since version 3.12.x:

1. New feature [generally available]: Thick Clones via the Data Mobility Service (DMS). Lightbits' Data Mobility Service is a standalone service that runs in a Docker container, on dedicated resources outside a Lightbits cluster, and manages and orchestrates the mobility of volumes and snapshots within the attached Lightbits clusters. The DMS currently provides the ability to create thick clones of Lightbits volumes, with more capabilities to come in future releases. In this release, volume cloning is fully supported within the cluster, while cross-cluster cloning and copying are introduced as a technology preview. (LBM1-34808)
2. api-service: Added a new attribute, IsTpm2Supported, to the server info output of the get/list servers API. It indicates whether the server supports using its Trusted Platform Module 2.0 (TPM2), if one exists. If information about a server's TPM2 support is unavailable, Lightbits treats it as not supported. (LBM1-35114)
3. api-service: Now returns an error to the client when cluster encryption with TPM is enabled but one or more servers in the cluster do not support TPM 2.0. (LBM1-35117)
4. cluster-manager: Added a missing cleanup flow for failed volume creations. Without the fix, failed volume creations could lead the relevant cluster-manager to perform unbalanced volume placements in the future. (LBM1-35512)
5. data-layer: Fixed incorrect handling of invalid values in control-plane monitoring. In unexpected cases where a control-plane key-value object was mistakenly stored with an invalid value, monitoring for specific key/prefix changes would stop, which could cause the control plane to malfunction. (LBM1-35501)
6. data-layer: Fixed an issue that could cause a node to hang when etcd becomes unresponsive. (LBM1-35894)
7. data-layer: Fixed incorrect error handling when watching for changes in the control plane. Due to a missing retry mechanism, in some rare cases watching for changes could stop completely, without raising any significant user event or attempting to restart the watch. This could potentially lead to service loss. Upgrading from 3.10.x and 3.11.x is highly recommended. Upgrading from 3.12.1 is recommended if you plan to move to 3.13.1; otherwise, we recommend waiting for 3.12.2, which will be the next LTS release. (LBM1-36090)
8. duroslight: Fixed a bug that could cause slightly longer rebuilds than needed due to missed synchronization points, which forced rebuilds to start from an earlier synchronization point. (LBM1-35565)
9. duroslight: Fixed missing rebuild data with TRIM enabled when, during a rebuild, a client trims LBA X and shortly afterward writes data to LBA X. In such a scenario, the secondary node being rebuilt could keep LBA X trimmed, thus missing the data written to it after the TRIM. (LBM1-35897)
10. duroslight: Removed ADQ support. (LBM1-29193)
11. events: Added the ServerEncryption event type, and added support for both ServerEncryption and ClusterEncryption events in EventsManager, ensuring accurate ComponentInfo details (name and ID). (LBM1-35976)
12. events: Added events for the EnableClusterEncryption process being initiated and completing successfully. (LBM1-35657)
13. lb-support: No longer collects kdump information if kdump is not used on the system. (LBM1-35753)
14. lbcli: Removed the DevelopTrim feature flag from the API and CLI. The flag is no longer needed now that TRIM has progressed from Alpha to technology preview to GA. (LBM1-35566)
15. node-manager: Improved the node-manager startup flow by adding previously missing events to various failure flows, and retry mechanisms for transient network issues when loading data-layer objects. (LBM1-35749)
16. node-manager: Made TPM encryption handling more robust when servers are reused in different clusters, by not relying on existing TPM sub-keys for encryption. (LBM1-36027)
17. node-manager: Added TPM2 availability information to the server details during server registration. (LBM1-35113)
18. node-manager: Fixed a bug that caused a node to remain inactive if encryption was enabled using TPM, but the TPM was then mistakenly cleared on the node's server (and the node-manager service was restarted). (LBM1-36044)
19. node-manager: Fixed an issue where a failure while starting the node-manager's internal services could cause the node to appear active when it was actually inactive. (LBM1-35575)
20. node-manager: Fixed a bug that caused the node ID, instead of the server ID, to appear in events when enabling encryption. (LBM1-35755)
21. node-manager: Fixed a bug in which two consecutive node-level updates, one of which was a node failure, led to a race condition that could make a node perform an unneeded full rebuild instead of a partial (and much quicker) rebuild. (LBM1-35633)
22. node-manager: Fixed a deadlock in the service's startup flow. If encryption was enabled on both the cluster and the server, but the local file containing the relevant encryption information was mistakenly deleted from the server, the next node-manager startup would deadlock, preventing the node from completing its activation flow. (LBM1-35853)
23. node-manager: Fixed an issue that could leave a node inactive under certain circumstances in which TPM access completely jams the Go runtime scheduler, preventing any other goroutines spawned by the running service from running. (LBM1-35340)
24. node-manager: Prevented a server from rejoining a cluster after it has been deleted from the cluster. (LBM1-35739)
25. node-manager: Now validates server compatibility with TPM 2.0 when encryption is enabled and the key store is set to TPM. If TPM 2.0 is unsupported, node-manager transitions to an inactive state and emits a ServerStartupFailed (303) event with cause TPM2NotSupported (1305). (LBM1-34672)
26. packages/backend: The TRIM object processing phase is now reflected in the powerup progress. (LBM1-35611)
27. packages/idp: Introduced rate limiting for requests to the ADFS Identity Provider, allowing up to three requests per second to prevent DoS attacks. (LBM1-35374)
28. profile-generator: Fixed a potential hang on Alma 9 kernels related to single-instance multi-NUMA deployments. The fix enables node-manager to reserve memory via huge pages and free it once the GFTL completes allocating most of the system RAM, ensuring that every node has minimal RAM left over to avoid hangs due to endless node-reclaim on Alma 9 kernels. The default extraReservedRAM is 50% of the OS reserved memory (currently 4 GB). (LBM1-35837)
29. upgrade-manager: Fixed an issue that caused the cluster's last upgrade status to remain "Upgrading" instead of "Failed" when the cluster upgrade process failed. (LBM1-35663)
30. userlbe: Resolved a rare issue where SSDs failing to return write completions for an extended period could cause the GFTL to crash, leaving the node inactive. (LBM1-35598)
31. userlbe: Fixed a bug where, under some circumstances during graceful shutdown and recovery, Lightbits would use one core fewer than possible. This fix should slightly speed up graceful shutdown/recovery under those circumstances. (LBM1-35733)
32. userlbe: Fixed an issue where the sorting algorithm was not configured properly for the large lists sorted during powerup, causing graceful powerup to take significantly longer than it should. (LBM1-35734)
33. userlbe: Fixed an issue where latency metrics were all bunched into the smallest histogram bucket, making disk latency issues harder to debug. Metrics are now correctly divided into buckets. (LBM1-35882)
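As an illustration of the new TPM 2.0 checks (items 2, 3, and 25), a client might gate enabling TPM-based cluster encryption on the new attribute as sketched below. The server records and the helper function are hypothetical; only the IsTpm2Supported attribute name and the "missing info means unsupported" rule come from these release notes.

```python
def tpm2_unsupported(servers):
    """Return the names of servers that cannot use TPM 2.0.

    Hypothetical helper: per the release notes, a server whose TPM2
    support information is unavailable is treated as not supported.
    """
    return [s["name"] for s in servers if not s.get("IsTpm2Supported", False)]


# Hypothetical records, standing in for get/list servers API output.
servers = [
    {"name": "server-00", "IsTpm2Supported": True},
    {"name": "server-01"},  # TPM2 info unavailable -> treated as unsupported
]

bad = tpm2_unsupported(servers)
if bad:
    # Mirrors the new api-service behavior: enabling cluster encryption
    # with TPM fails when any server lacks TPM 2.0 support.
    print("cannot enable TPM-based encryption; unsupported on:", ", ".join(bad))
```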
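The rate limiting introduced in item 27 can be sketched with a classic token bucket. This is a minimal illustration, not the actual packages/idp implementation; the burst capacity (here equal to the per-second rate) is an assumption.

```python
import time


class TokenBucket:
    """Token-bucket rate limiter: allows up to `rate` requests per second."""

    def __init__(self, rate, capacity=None):
        self.rate = rate
        self.capacity = capacity if capacity is not None else rate
        self.tokens = self.capacity
        self.last = time.monotonic()

    def allow(self):
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Three requests per second, as in the release note; a fourth
# immediate request is rejected until tokens refill.
limiter = TokenBucket(rate=3)
```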
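For item 33, dividing latency samples across histogram buckets can be sketched as follows. The bucket boundaries below are invented for illustration and are not Lightbits' actual metric buckets.

```python
import bisect

# Hypothetical latency bucket upper bounds, in microseconds.
BUCKET_BOUNDS_US = [100, 250, 500, 1000, 2500, 5000, 10000]


def bucket_counts(samples_us):
    """Count how many latency samples fall into each bucket.

    The extra final slot counts samples above the largest bound
    (the "+Inf" bucket in Prometheus-style histograms).
    """
    counts = [0] * (len(BUCKET_BOUNDS_US) + 1)
    for sample in samples_us:
        counts[bisect.bisect_left(BUCKET_BOUNDS_US, sample)] += 1
    return counts
```

With correctly sized buckets, outliers land in the upper buckets instead of everything collapsing into the smallest one, which is what made disk latency issues hard to debug before the fix.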

Installation and Upgradeability

You can upgrade to this release from all previous Lightbits 3.10.x, 3.11.x, and 3.12.x releases.
