DMS Release Notes
1.3.0 (2025-06-10)
🎁 Features
- Added support for limiting the maximum number of concurrent workflows. This limit can be configured via the `maxConcurrentWorkflowTasks` variable. Its default value is set to `10`.
- Added support for optionally disabling the creation of new workflows via the Temporal UI. This mode can be configured via the `temporal_start_workflow_disabled` variable. Its default value is set to `true` (disabling the creation of new workflows from the UI).
- Added support for configuring a maximum timeout for the `cloneVolume` and `verifyClone` activities. This timeout can be configured via the `startToCloseTimeout` variable. Its default value is set to `15m` (see the configuration sketch below).
- DMS service health can be monitored via the following alerts:
  - `InstanceDown` - the DMS service is down/not responding to scrapes. This alert is based on the `Up` metric.
  - `HealthFail` - the DMS service (or one of its sub-services) is unhealthy. This alert is based on the `dms_service_health_state` metric.
  Note that these alerts and metrics were available previously. This release fixes and improves some logic related to them and also introduces a minor name change (`UpDown` -> `InstanceDown`) for improved consistency and clarity.
- Added support for optionally specifying the compression mode for the thick-clone operation. When explicitly specified, the given value is used for the cloned volume/snapshot. When no value is specified, the compression mode of the source snapshot is used.
- Added support for optionally specifying the sector-size mode for the thick-clone operation. When explicitly specified, the given value is used for the cloned volume/snapshot. When no value is specified, the sector size of the source snapshot is used.
- Upgraded the Temporal stack images, as well as the observability images.
- Upgraded the discovery-client included with the DMS release to v1.19.0.
For specific guidelines on how to modify these new configuration variables (and/or existing ones), refer to the “overriding-workflow-specific-variables” and “post-deployment-variable-updated” sections in the dms-installation.md doc included in the release bundle.
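The following is a minimal sketch of how the new concurrency and timeout settings plausibly map onto the Temporal Go SDK that the DMS worker builds on. Only the variable names and defaults (`maxConcurrentWorkflowTasks` = 10, `startToCloseTimeout` = 15m) come from this release; the task-queue name, package name, and helper functions are illustrative assumptions, not the actual DMS implementation.

```go
// Illustrative only: shows where settings like maxConcurrentWorkflowTasks and
// startToCloseTimeout typically plug into a Temporal Go SDK worker. The task
// queue name and function names are assumptions, not DMS code.
package dmsworker

import (
	"time"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
	"go.temporal.io/sdk/workflow"
)

// newWorker caps concurrent workflow tasks, mirroring the new
// maxConcurrentWorkflowTasks variable (default 10).
func newWorker(c client.Client) worker.Worker {
	return worker.New(c, "dms-task-queue", worker.Options{
		MaxConcurrentWorkflowTaskExecutionSize: 10,
	})
}

// withCloneActivityOptions applies the startToCloseTimeout (default 15m) that
// now bounds the cloneVolume and verifyClone activities.
func withCloneActivityOptions(ctx workflow.Context) workflow.Context {
	return workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: 15 * time.Minute,
	})
}
```

The `temporal_start_workflow_disabled` variable applies to the Temporal UI rather than the worker, so it is not shown here.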
🐞 Bug Fixes
- Updated the `HealthFail` recording rule to point to the correct metric.
- Fixed an issue in the Health probe that erroneously always returned the `SERVING` value (see the sketch after this list).
- Improved error handling in both the refresh and list clusters operations.
- Fixed inconsistent use of wid/rid keys and values in the logs, improving debugging and automated log parsing.
- ansible-dms-role: back up dms.yml if it is updated during an upgrade.
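As context for the health-probe fix above, here is a minimal sketch, assuming the standard gRPC health-checking service, of deriving the reported status from actual sub-service state instead of hard-coding `SERVING`; the `healthy` callback and package name are hypothetical.

```go
// Minimal sketch using the standard gRPC health-checking service: the status
// reported on the empty ("overall") service name is derived from a health
// check instead of always being SERVING. The healthy callback is hypothetical.
package dmshealth

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func registerHealth(s *grpc.Server, healthy func() bool) *health.Server {
	hs := health.NewServer()
	healthpb.RegisterHealthServer(s, hs)

	status := healthpb.HealthCheckResponse_NOT_SERVING
	if healthy() {
		status = healthpb.HealthCheckResponse_SERVING
	}
	// Empty service name reports the overall server health.
	hs.SetServingStatus("", status)
	return hs
}
```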
🚧 Chores
- Bumped to version 1.3.0.
- Bumped protoc-gen-go to v1.36.5.
1.2.0 (2025-02-13)
🎁 Features
- observability: Added the new dms-dashboard that exposes some useful metrics.
- Added the `dms_service_health_state` metric.
- Added the `cluster_connection_state` metric (a registration sketch for both metrics follows this list).
- Made `verifyClone` optional, defaulting to `false`, for the thick-clone APIs.
- ansible-docker-role: allowed adding Linux users to the Docker group.
- Bumped temporal-ui version to temporalio/ui:2.35.0.
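As a rough illustration of the two new metrics, here is a sketch of registering them with the Prometheus Go client; the label names, value semantics, and help strings are assumptions rather than the exact DMS definitions.

```go
// Sketch of registering the new gauges with the Prometheus Go client.
// Label names and help text below are assumptions.
package dmsmetrics

import "github.com/prometheus/client_golang/prometheus"

var (
	serviceHealthState = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "dms_service_health_state",
		Help: "Health state of the DMS service and its sub-services.",
	}, []string{"service"})

	clusterConnectionState = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "cluster_connection_state",
		Help: "Connection state of each attached cluster.",
	}, []string{"cluster"})
)

func init() {
	prometheus.MustRegister(serviceHealthState, clusterConnectionState)
}
```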
🐞 Bug Fixes
- Ran discovery-client with `network_mode=host` (LBM1-36509).
- ansible-dms-role: stopped exposing ports 7233 and 5432 for enhanced security.
- Updated metrics emitted during clone/verify activities.
- Updated and exposed the port for dms-worker SDK metrics.
📄 Documentation
- Updated API docs with summary and descriptions.
- Added metrics of interest for monitoring error conditions in the DMS stack.
- Added day-two-operations.md to elaborate on maintenance and troubleshooting of the DMS server.
- Updated dmscli.md with response types.
🚧 Chores
- Bumped version to v1.2.0.
- Enriched metrics and logs of clone/verify with the src/dst endpoint.
- Excluded multipath devices for node-exporter diskstats.
1.1.0 (2025-01-22)
🎁 Features
- Added dms.alert.rules.yml.
- Added node-exporter.alert.rules.yml.
- Added support for optionally specifying the sector size (512/4096) on the destination resource (issue: LBM1-36164).
- Added a new `grpc_go` dashboard for DMS service gRPC visualization.
- dms.env: upgraded discovery-client to version 1.18.0.
🐞 Bug Fixes
- ansible: Fixed an issue of attaching multiple DMS services to the same cluster, due to the same pubKeyID being used across different DMS hosts (issue: LBM1-36163).
- thick-clone: Fixed an issue of thick cloning from a snapshot that originally had a sector size of 512 bytes (issue: LBM1-36164).
- metrics: Exposed metrics via an HTTP handler at the /metrics URL (see the sketch after this list).
- prometheus.yml: Updated the reference Prometheus config to scrape the DMS service correctly (the target destination was incorrectly specified).
- docker-compose.yml.j2: Fixed an issue with the discovery-client cfg file name.
- Fixed a permission access issue with DMS dynamicconfig files.
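For the /metrics fix above, a minimal sketch of exposing Prometheus metrics over an HTTP handler looks like the following; the listen address is illustrative.

```go
// Minimal sketch: serve Prometheus metrics on /metrics via promhttp.
// The listen address is illustrative.
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":2112", nil))
}
```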
📄 Documentation
- Documented metrics and alerts.
- Packaged proto files in the dms-docs tarball.
1.0.0 (2024-12-30)
🎁 Features
- Support of a single persistent volume for all DMS service files.
- DMSCLI: simplified the attach-cluster/get-credentials flow by exporting the base64-encoded public key directly to a file.
- Added support for pagination in list workflows (see the sketch after this list).
- Upgraded grpc-gateway to v2.25.1.
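To illustrate the list-workflows pagination, here is a sketch of token-based paging from the caller's side; the request/response field names (`PageSize`, `PageToken`, `NextPageToken`) and the client interface are hypothetical stand-ins, not the actual DMS API.

```go
// Sketch of consuming a token-paginated ListWorkflows API. All types and
// field names below are hypothetical stand-ins for illustration.
package dmslist

import "context"

type ListWorkflowsRequest struct {
	PageSize  int32
	PageToken string
}

type ListWorkflowsResponse struct {
	Workflows     []string
	NextPageToken string
}

type Client interface {
	ListWorkflows(ctx context.Context, req *ListWorkflowsRequest) (*ListWorkflowsResponse, error)
}

// listAllWorkflows follows NextPageToken until the server returns an empty token.
func listAllWorkflows(ctx context.Context, c Client) ([]string, error) {
	var all []string
	req := &ListWorkflowsRequest{PageSize: 100}
	for {
		resp, err := c.ListWorkflows(ctx, req)
		if err != nil {
			return nil, err
		}
		all = append(all, resp.Workflows...)
		if resp.NextPageToken == "" {
			return all, nil
		}
		req.PageToken = resp.NextPageToken
	}
}
```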
🐞 Bug Fixes
- Adjusted lbclient timeouts to 10s for every operation and added retries.
- Added verification and protections for cases where snapshots and volumes with the same name already exist on clusters.
- Fixed incomplete cleanup after some failed thick clone operations.
📈 Performance Improvements
- Enabled configuration of the number of concurrent workflows, changing the default from 2 to 10.
- Enabled configuration of the number of go-routines per workflow, and changed the default from the maximum number of cores to min(32, max cores) (see the sketch after this list).
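A one-line sketch of the new default described above, assuming the cap is computed from the host's core count; the helper name is illustrative.

```go
// Sketch of the default: goroutines per workflow = min(32, number of cores).
// The helper name is illustrative.
package dmsdefaults

import "runtime"

func defaultGoroutinesPerWorkflow() int {
	if n := runtime.NumCPU(); n < 32 {
		return n
	}
	return 32
}
```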
API Changes/Product Changes
- Updated lb-ansible version from v11.0.0 to v9.13.0.
- Renamed the following APIs (removing the `Create` prefix):
  - `CreateThickCloneVolumeRequest` -> `ThickCloneVolumeRequest`
  - `CreateThickCloneVolumeResponse` -> `ThickCloneVolumeResponse`
  - `CreateThickCloneSnapshotRequest` -> `ThickCloneSnapshotRequest`
  - `CreateThickCloneSnapshotResponse` -> `ThickCloneSnapshotResponse`
- Added protobuf files and generated Go code to the docs bundle.