Release 3.5.1

Release Date

v3.5.1 was released to the public on November 8th, 2023.

New in This Release

This release introduces the following changes since version 3.4.2:

  1. api-service: Lightbits systems originally supported three internal modes of multi-tenancy operation: "disabled," "permissive," and "enforcing," although only "enforcing" was ever deployed in the field. This release removes the "disabled" and "permissive" modes entirely, leaving only "enforcing." No user changes are expected or needed.
  2. api-service: Only allow upgrading servers that are in the Active state.
  3. api-service: Volume-related API calls that provide both a name and a UUID no longer fail. Instead, the UUID is used.
  4. lbavs: lbavs now exits with exit code 0 on success and a non-zero exit status on command failure.
  5. lbavs: Use Microsoft managed identities for authentication to Azure cloud resources.
  6. azure: Lightbits events, such as cluster capacity limit alerts, now appear in Azure Application Insights.
  7. azure: Added a new tab for Azure VMware Solution (AVS) to the Azure Marketplace offer. This tab collects the information used to create the resources required for the connection between AVS and LBCluster: a public IP, a GatewaySubnet, a virtual gateway, and an ExpressRoute connection. Note that this option is available only to users who choose to create a new VNet for the Lightbits cluster. Users who choose an existing VNet need to create the required resources themselves.
  8. cluster-manager: Properly handle attempts to upgrade servers that have transitioned to states other than Active.
  9. data-layer: Fixed a race condition in the snapshots cache that could, under rare circumstances, result in service disruption. When a snapshot was migrating from a source to a destination node, it was possible for the snapshot to be created on the destination node while another node remained unaware of the successful migration. This scenario could cause the unaware node to stop responding to further updates.
  10. discovery-client: Updated the Golang text module due to a potential Denial of Service (DoS) vulnerability in the module.
  11. duroslight: Fixed a race condition between read I/O accounting and client connection tracking. This race condition was most likely to occur when a connection was released under heavy read traffic. Encountering this race condition caused a decrease in the server's ability to accept future read I/Os and resulted in an assertion failure during service shutdown.
  12. duroslight: Volume access permission checks on the data path are now slightly faster.
  13. duroslight: Addressed an issue where in some rare cases, duroslight erroneously handled I/Os originating from connections that were alive when the I/O was received but were closed before the I/O could be processed. This has been resolved by verifying the connection's status immediately before processing I/O requests.
  14. exporter: Added a new metric, instance:volume:volume_migrating, that exposes the volume migration state. Also fixed a bug affecting releases older than 3.3.2 where the instance:volume:volume_state_degraded metric remained asserted even after volume migration completed.
  15. lb_irq_balance.service: A new interrupt balancing service is automatically started along with gftl.service. The interrupt balancing service dynamically adjusts the NIC IRQ (Interrupt Request) core affinity based on the load reported by userlbe. In some instances, this can lead to a performance increase of over 50%. Note that the IRQ balancer must be disabled on Azure VMs.
  16. node-manager: Fixed a rare bug where a volume could become stuck in a migrating state, causing write operations to fail. This issue could only occur when a volume was rapidly migrated out of and back into one of the nodes that held its replica.
  17. upgrade-manager: In some rare cases, server UUIDs may be missing from the exporter YAML configuration file. We now gracefully recover from this situation and also clean up etcd open file descriptors in case of startup failure.
  18. upgrade-manager: Fixed a bug where a "delete server" API command issued during an ongoing cluster upgrade would fail the upgrade and prevent any new upgrade operations.
  19. userlbe: Addressed a rare crash that occurred when a disk went offline between write unit creation and completion.
  20. userlbe: Recovery from an abrupt shutdown is now up to 3.5 times faster.
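
The exit-status convention described in item 4 makes lbavs straightforward to call from automation. The following is a minimal sketch in Python that relies only on that convention. The helper name command_succeeded is hypothetical, and because these notes do not list lbavs subcommands or flags, the caller supplies the full command line.

```python
import subprocess
import sys


def command_succeeded(argv):
    """Run a command and judge success purely by its exit status.

    Follows the convention from item 4: exit code 0 means success,
    any non-zero status means the command failed.
    Returns a (succeeded, exit_status) tuple.
    """
    # capture_output keeps the child's stdout/stderr out of our own output.
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode == 0, result.returncode


# Stand-in command for illustration only; replace the list with the real
# lbavs command line (its arguments are an assumption left to the caller).
ok, status = command_succeeded([sys.executable, "-c", "import sys; sys.exit(0)"])
if not ok:
    print(f"command failed with exit status {status}", file=sys.stderr)
```

A wrapper like this can stop a deployment script as soon as a step reports a non-zero status, instead of parsing command output.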

Installation and Upgradeability

This release is upgradeable from the prior releases listed below. An "(x)" indicates that this release is directly upgradeable from the listed release. Empty brackets "( )" indicate that this release is not directly upgradeable from that particular release; upgrading requires either an upgrade through an intermediary release (e.g., 3.3.2 to 3.4.2 to 3.5.1) or a new installation.

( ) Lightbits 2.2.x

( ) Lightbits 2.3.x

( ) Lightbits 3.0.x

( ) Lightbits 3.1.1

( ) Lightbits 3.1.2

( ) Lightbits 3.2.1

( ) Lightbits 3.3.1

( ) Lightbits 3.3.2

(x) Lightbits 3.4.1

(x) Lightbits 3.4.2
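
The list above can be encoded as a pre-flight check in upgrade tooling. The following is a minimal sketch in Python, not part of the Lightbits product. The names DIRECTLY_UPGRADEABLE and can_upgrade_directly are hypothetical, and the sketch assumes the installed version is already available as a plain string such as "3.4.2".

```python
# Releases marked (x) in the list above. Every other release needs an
# intermediary release or a new installation before moving to 3.5.1.
DIRECTLY_UPGRADEABLE = {"3.4.1", "3.4.2"}


def can_upgrade_directly(current_version):
    """Return True if current_version may upgrade straight to 3.5.1."""
    version = current_version.strip()
    # Accept an optional leading "v", as in "v3.4.2".
    if version.startswith("v"):
        version = version[1:]
    return version in DIRECTLY_UPGRADEABLE


for installed in ("3.4.2", "3.3.2"):
    if can_upgrade_directly(installed):
        print(f"{installed}: direct upgrade to 3.5.1 is supported")
    else:
        print(f"{installed}: upgrade through an intermediary release first")
```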
