Release 3.10.1

Release Date

v3.10.1 was released to the public on August 14, 2024.

New in This Release

This release introduces the following changes since version 3.9.x:

  1. api-service: Enhanced the disable server API to optionally accept an override for the permanent failure threshold timeout. This makes it possible to set a dedicated permanent failure timeout for servers that are taken offline for maintenance.
  2. api-service: Exposed node metadata RAM utilization in the API.
  3. api-service: Added verification that a project has no associated resources (snapshots) before it is deleted.
  4. backend: Fixed a bug introduced in 3.9.1, where a node could become Inactive if the following sequence of events occurred:
     • A snapshot was taken of a volume.
     • No new data was written to the volume.
     • The above snapshot was deleted.
     • The node experienced an abrupt shutdown.
  5. backend: Fixed a possible stability issue following an error detected in one of the NVMe SSD devices.
  6. cluster-manager: Fixed an issue that could cause a volume to remain in the Migrating state for an extended period.
  7. cluster-manager: Fixed an issue where control operations could be unavailable following the loss of an active cluster manager. For this issue to occur, the cluster manager must first experience another error, such as a network or load issue, that coincides exactly with the very short window in which internal resources are updated with information on the migration completion.
  8. data-layer: Fixed a bug in the handling of snapshot deletions that, under rare conditions, could prevent rebuilds from starting and leave volumes stuck in Degraded/Read-Only protection states.
  9. discovery-service: Fixed a potential loss of discovery-service during reconnection, following an initial failure to connect.
  10. duroslight: Fixed a potential issue that could cause duroslight slowness/unresponsiveness, which would in turn lead to a node becoming inactive. The fix uses mlockall (the --lock-memory option) to keep duroslight pages from being evicted from the page cache. This works around abnormal behavior observed in el9_3 kernels, which kept evicting read-only pages and made the process very slow due to constant major page faults that re-read pages from the executable file.
  11. duroslight: Optimized the rollback of recovered ranges for the TRIM feature.
  12. duroslight: Fixed a rare race that could occur during a series of consecutive rebuild starts and rebuild aborts (i.e., when a node repeatedly switches between Active/Inactive states or in and out of the Read Only state). This issue could result in the affected volumes remaining in the Degraded state.
  13. events: Added an event for a node entering the Unattached state (all volumes and snapshots have been migrated off the node).
  14. general: Ended Lightbits support for CentOS 7.
  15. monitoring-stack: Added QoS Delay metrics to the Front-End (FE) and monitoring stack, shown in the Grafana Volume Performance dashboard.
  16. monitoring-stack: Fixed display units on Front-End (FE) Read and Write Latency panels in the server and volume dashboards, to correctly show microseconds by default.
  17. monitoring-stack: Fixed missing SMART statistics (critical|warning)_comp_temperature_time.
  18. node-manager: Enhanced auto-revive support to attempt a graceful shutdown of the backend service once auto-revive logic is triggered (that is, when it detects an unresponsive service or a fatal error in the service).
  19. node-manager: Fixed an issue in the calculation and allocation of memory for Lightbits services. Node-manager now takes the page-table overhead (0.2%) into account when calculating the available free RAM. This prevents the overhead from depleting the systemReservedRam specified in system-profile.yaml on systems with large RAM, where 0.2% is of the same order of magnitude as systemReservedRam.
  20. node-manager/cluster-manager: Fixed an issue where rare internal races and internal data inconsistencies could flood the service logs.
  21. upgrade-manager: Fixed an issue where a cluster upgrade failure could be reported erroneously.
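
The page-table adjustment described in the node-manager item can be illustrated with a little arithmetic. The sketch below is not Lightbits code: the function names, the assumption that the 0.2% overhead is taken from total RAM, and the sample values for total RAM and systemReservedRam are hypothetical and chosen only for illustration. Only the 0.2% figure comes from the note itself.

```python
# Illustrative arithmetic only; this is NOT the node-manager implementation.
GIB = 1024 ** 3

# 0.2% page-table overhead (figure from the release note), kept as an
# integer ratio so the calculation stays exact for large byte counts.
OVERHEAD_NUMERATOR = 2
OVERHEAD_DENOMINATOR = 1000


def page_table_overhead_bytes(total_ram: int) -> int:
    """Return 0.2% of total_ram, rounded down (assumed basis: total RAM)."""
    return total_ram * OVERHEAD_NUMERATOR // OVERHEAD_DENOMINATOR


def usable_ram_bytes(total_ram: int, system_reserved_ram: int) -> int:
    """RAM left for services after the reserve and the page-table overhead."""
    remaining = total_ram - system_reserved_ram - page_table_overhead_bytes(total_ram)
    return max(0, remaining)


if __name__ == "__main__":
    total = 2048 * GIB   # hypothetical 2 TiB server
    reserved = 8 * GIB   # hypothetical systemReservedRam value
    print(f"page-table overhead: {page_table_overhead_bytes(total) / GIB:.2f} GiB")
    print(f"usable for services: {usable_ram_bytes(total, reserved) / GIB:.2f} GiB")
```

With these sample numbers the overhead is roughly 4 GiB, which is the same order of magnitude as the 8 GiB reserve. Left unaccounted for, it would consume a large share of that reserve.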

Installation and Upgradeability

You can upgrade to this release from all previous Lightbits v3.7.x, 3.8.x, and 3.9.x releases.
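
As a quick pre-upgrade sanity check, the source release series listed above can be encoded in a few lines. This is a hypothetical helper and not part of any Lightbits tool. The function name and the version-string parsing are assumptions. It mirrors only the series named in the sentence above, so any other version is reported as not listed.

```python
# Hypothetical helper; mirrors only the release series named in these notes.
LISTED_SOURCE_SERIES = {(3, 7), (3, 8), (3, 9)}


def is_listed_upgrade_source(installed_version: str) -> bool:
    """Return True if installed_version is in a series listed as upgradeable."""
    parts = installed_version.strip().lstrip("v").split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return False  # version string could not be parsed
    return (major, minor) in LISTED_SOURCE_SERIES


if __name__ == "__main__":
    for version in ("3.9.1", "v3.7.2", "3.6.4"):
        print(version, is_listed_upgrade_source(version))
```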
