DMS Release Notes

1.3.0 (2025-06-10)

🎁 Features

  • Added support for limiting the maximum number of concurrent workflows. This limit can be configured via the maxConcurrentWorkflowTasks variable. Its default value is set to [10].

  • Added support for optionally disabling the creation of new workflows via Temporal UI. This mode can be configured via the temporal_start_workflow_disabled variable. Its default value is set to [true] (disabling the creation of new workflows from the UI).

  • Added support for configuring a maximum timeout for cloneVolume and verifyClone activities. This timeout can be configured via the startToCloseTimeout variable. Its default value is set to [15m].

  • DMS service health can be monitored via the following alerts:

    • InstanceDown - DMS service is down/not responding to scrapes. This alert is based on the up metric.
    • HealthFail - DMS service (or one of its sub-services) is unhealthy. This alert is based on the dms_service_health_state metric.

    Note that these alerts and metrics were available previously. This release fixes and improves some of the logic related to them, and also renames one alert (UpDown -> InstanceDown) for improved consistency and clarity.

  • Added support for optionally specifying the compression mode for the thick-clone operation. When a value is explicitly specified, it is used for the cloned volume/snapshot. When no value is specified, the compression mode of the source snapshot is used.

  • Added support for optionally specifying the sector size for the thick-clone operation. When a value is explicitly specified, it is used for the cloned volume/snapshot. When no value is specified, the sector size of the source snapshot is used.

  • Upgraded temporal stack images, as well as observability images.

  • Upgraded the discovery-client version included with the DMS release to v1.19.0.

For specific guidelines on how to modify these new configuration variables (and/or existing ones), refer to the “overriding-workflow-specific-variables” and “post-deployment-variable-updated” sections in the dms-installation.md doc included in the release bundle.
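
The fallback rule described above for the thick-clone compression mode and sector size can be sketched as follows. This is an illustrative sketch only, not DMS code: the CloneProps type and resolve_clone_props function are hypothetical names, compression mode is modeled as a simple on/off flag (an assumption), and the allowed sector sizes are taken from the 512/4096 values mentioned in the 1.1.0 notes.

```python
from dataclasses import dataclass
from typing import Optional

# Sector sizes named elsewhere in these notes (512/4096).
VALID_SECTOR_SIZES = (512, 4096)


@dataclass(frozen=True)
class CloneProps:
    """Hypothetical container for illustration; not a DMS API type."""
    compression: bool  # assumption: compression mode modeled as on/off
    sector_size: int   # bytes


def resolve_clone_props(
    source: CloneProps,
    compression: Optional[bool] = None,
    sector_size: Optional[int] = None,
) -> CloneProps:
    """Explicit values win; unset values are inherited from the source snapshot."""
    if sector_size is not None and sector_size not in VALID_SECTOR_SIZES:
        raise ValueError(f"unsupported sector size: {sector_size}")
    return CloneProps(
        compression=source.compression if compression is None else compression,
        sector_size=source.sector_size if sector_size is None else sector_size,
    )
```

Calling resolve_clone_props(source) with no overrides returns the source properties unchanged, which mirrors the "no value specified" case.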

🐞 Bug Fixes

  • Updated the HealthFail recording rule to point to the correct metric.
  • Fixed an issue in the Health probe that erroneously always returned the SERVING value.
  • Improved error handling in both the refresh-cluster and list-clusters operations.
  • Fixed inconsistent use of wid/rid key+values in the logs, to improve debugging and automated log parsing.
  • ansible-dms-role: dms.yml is now backed up if it is updated during an upgrade.

🚧 Chores

  • Bumped to version 1.3.0.
  • Bumped protoc-gen-go to v1.36.5.

1.2.0 (2025-02-13)

🎁 Features

  • observability: Added the new dms-dashboard that exposes some useful metrics.
  • Added the dms_service_health_state metric.
  • Added the cluster_connection_state metric.
  • Made verifyClone optional for thick-clone APIs, with a default of false.
  • ansible-docker-role: allowed adding linux-users to Docker group.
  • Bumped temporal-ui version to temporalio/ui:2.35.0.

🐞 Bug Fixes

  • Ran discovery-client with network_mode=host (LBM1-36509).
  • ansible-dms-role: unexposed ports 7233 and 5432 for enhanced security.
  • Updated metrics emitted during clone/verify activities.
  • Updated and exposed port for dms-worker sdk metrics.

📄 Documentation

  • Updated API docs with summary and descriptions.
  • Added metrics of interest to monitor error issues with the DMS stack.
  • Added day-two-operations.md to elaborate on maintenance and troubleshooting of the DMS server.
  • Updated dmscli.md with response types.

🚧 Chores

  • Bumped version to v1.2.0.
  • Enriched metrics and logs of clone/verify with the src/dst endpoint.
  • Excluded multipath devices for node-exporter diskstats.

1.1.0 (2025-01-22)

🎁 Features

  • Added dms.alert.rules.yml.
  • Added node-exporter.alert.rules.yml.
  • Added support for optionally specifying the sector size (512/4096) on the destination resource (issue: LBM1-36164).
  • Added a new grpc_go dashboard for dms service grpc visualization.
  • dms.env: upgraded discovery-client to version 1.18.0.

🐞 Bug Fixes

  • ansible: Fixed an issue with attaching multiple DMS services to the same cluster, caused by the same pubKeyID being used across different DMS hosts (issue: LBM1-36163).
  • thick-clone: Fixed an issue with thick cloning from a snapshot that originally had a sector size of 512 bytes (issue: LBM1-36164).
  • metrics: Exposed metrics using an HTTP handler at the /metrics URL.
  • prometheus.yml: Updated the reference Prometheus config to scrape the DMS service correctly (the target destination was incorrectly specified).
  • docker-compose.yml.j2: Fixed an issue with the discovery-client cfg file name.
  • Fixed a permission access issue with DMS dynamicconfig files.

📄 Documentation

  • Documented metrics and alerts.
  • Packaged proto files in the dms-docs tarball.

1.0.0 (2024-12-30)

🎁 Features

  • Added support for a single persistent volume for all DMS service files.
  • DMSCLI: Simplified the attach cluster/get credentials flow by exporting the base64-encoded public key directly to a file.
  • Added support for pagination in list workflows.
  • Upgraded grpc-gateway to v2.25.1.
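
As a rough illustration of how a client might consume a paginated list-workflows call, the sketch below loops until no further page token is returned. The fetcher signature, the page_size parameter, and the token semantics are all assumptions made for this example; consult the API docs in the release bundle for the actual request and response fields.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical page fetcher: takes (page_size, page_token) and returns
# (items, next_page_token); an empty or None token means "no more pages".
PageFetcher = Callable[[int, Optional[str]], Tuple[List[str], Optional[str]]]


def iter_workflows(fetch_page: PageFetcher, page_size: int = 50) -> Iterator[str]:
    """Yield every workflow by following page tokens until they run out."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(page_size, token)
        yield from items
        if not token:
            return
```

Wrapping the loop in a generator keeps callers unaware of page boundaries while still fetching only one page at a time.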

🐞 Bug Fixes

  • Adjusted lbclient timeouts to 10s for every operation and added retries.
  • Added verification and protections for snapshots and volumes that already existed on clusters with the same name.
  • Fixed incomplete cleanup after some failed thick clone operations.
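
The per-operation timeout plus retries pattern mentioned in the lbclient fix can be sketched as below. This is not the lbclient implementation: the function name, the number of attempts, the exponential backoff, and the set of retryable errors are assumptions chosen for illustration. Only the 10s per-operation timeout comes from the notes.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    op: Callable[[float], T],
    timeout_s: float = 10.0,   # per-operation timeout from the notes
    attempts: int = 3,         # assumption: retry count is not documented
    backoff_s: float = 0.5,    # assumption: base delay, doubled per retry
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op(timeout_s), retrying on timeout/connection errors."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_err: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return op(timeout_s)
        except (TimeoutError, ConnectionError) as err:
            last_err = err
            if attempt < attempts:
                sleep(backoff_s * 2 ** (attempt - 1))
    assert last_err is not None
    raise last_err
```

Here the operation itself is expected to honor the timeout it is given; the helper only decides whether to try again.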

📈 Performance Improvements

  • Enabled configuration of the number of concurrent workflows, changing the default from 2 to 10.
  • Enabled configuration of the number of goroutines per workflow, changing the default from max cores to min(32, max cores).
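
The two defaults above can be illustrated with a small sketch. It is written in Python purely for illustration (DMS itself uses goroutines), and the function names are hypothetical. It shows the min(32, max cores) rule and a generic way to cap how many tasks run at once, using the default limit of 10.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")

MAX_CONCURRENT_WORKFLOWS = 10  # default stated in the notes


def default_routines_per_workflow(cores: Optional[int] = None) -> int:
    """Default parallelism per workflow: min(32, max cores)."""
    if cores is None:
        cores = os.cpu_count() or 1
    return min(32, cores)


def run_bounded(
    tasks: Sequence[Callable[[], T]],
    limit: int = MAX_CONCURRENT_WORKFLOWS,
) -> List[T]:
    """Run tasks with at most `limit` executing at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Capping at 32 keeps very large hosts from spawning an excessive number of parallel routines per workflow, while smaller hosts still use all of their cores.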

API Changes/Product Changes

  • Updated lb-ansible version from v9.13.0 to v11.0.0.

  • Renamed the following APIs (removing the Create prefix):

    • CreateThickCloneVolumeRequest -> ThickCloneVolumeRequest.
    • CreateThickCloneVolumeResponse -> ThickCloneVolumeResponse.
    • CreateThickCloneSnapshotRequest -> ThickCloneSnapshotRequest.
    • CreateThickCloneSnapshotResponse -> ThickCloneSnapshotResponse.

  • Added protobuf files and generated Go code to the docs bundle.
