DMS Release Notes
1.3.0 (2025-06-10)
🎁 Features
- Added support for limiting the maximum number of concurrent workflows. This limit can be configured via the `maxConcurrentWorkflowTasks` variable. Its default value is set to `10`.
- Added support for optionally disabling the creation of new workflows via the Temporal UI. This mode can be configured via the `temporal_start_workflow_disabled` variable. Its default value is set to `true` (disabling the creation of new workflows from the UI).
- Added support for configuring a maximum timeout for the `cloneVolume` and `verifyClone` activities. This timeout can be configured via the `startToCloseTimeout` variable. Its default value is set to `15m` (see the configuration sketch below).
- DMS service health can be monitored via the following alerts:
  - `InstanceDown` - the DMS service is down/not responding to scrapes. This alert is based on the `Up` metric.
  - `HealthFail` - the DMS service (or one of its sub-services) is unhealthy. This alert is based on the `dms_service_health_state` metric.
  Note that these alerts and metrics were available previously. This release fixes and improves some logic related to them and also introduces a minor name change (`UpDown` -> `InstanceDown`) for improved consistency and clarity.
- Added support for optionally specifying the compression mode for the thick-clone operation. When explicitly specified, the given value is used for the cloned volume/snapshot. When no value is specified, the compression mode of the source snapshot is used.
- Added support for optionally specifying the sector-size mode for the thick-clone operation. When explicitly specified, the given value is used for the cloned volume/snapshot. When no value is specified, the sector size of the source snapshot is used.
- Upgraded the Temporal stack images, as well as the observability images.
- Upgraded the discovery-client included with the DMS release to v1.19.0.
For specific guidelines on how to modify these new configuration variables (and/or existing ones), refer to the “overriding-workflow-specific-variables” and “post-deployment-variable-updated” sections in the dms-installation.md doc included in the release bundle.
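The following is a minimal sketch of how the new concurrency and timeout settings plausibly map onto the Temporal Go SDK that the DMS worker builds on. Only the variable names and defaults (`maxConcurrentWorkflowTasks` = 10, `startToCloseTimeout` = 15m) come from this release; the task-queue name, package name, and helper functions are illustrative assumptions, not the actual DMS implementation.

```go
// Illustrative only: shows where settings like maxConcurrentWorkflowTasks and
// startToCloseTimeout typically plug into a Temporal Go SDK worker. The task
// queue name and function names are assumptions, not DMS code.
package dmsworker

import (
	"time"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
	"go.temporal.io/sdk/workflow"
)

// newWorker caps concurrent workflow tasks, mirroring the new
// maxConcurrentWorkflowTasks variable (default 10).
func newWorker(c client.Client) worker.Worker {
	return worker.New(c, "dms-task-queue", worker.Options{
		MaxConcurrentWorkflowTaskExecutionSize: 10,
	})
}

// withCloneActivityOptions applies the startToCloseTimeout (default 15m) that
// now bounds the cloneVolume and verifyClone activities.
func withCloneActivityOptions(ctx workflow.Context) workflow.Context {
	return workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: 15 * time.Minute,
	})
}
```

The `temporal_start_workflow_disabled` variable applies to the Temporal UI rather than the worker, so it is not shown here.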
🐞 Bug Fixes
- Updated the `HealthFail` recording rule to point to the correct metric.
- Fixed an issue in the Health probe that erroneously always returned the `SERVING` value (see the sketch after this list).
- Improved error handling in both the refresh and list clusters operations.
- Fixed inconsistent use of wid/rid keys and values in the logs, improving debugging and automated log parsing.
- ansible-dms-role: back up dms.yml if it is updated during an upgrade.
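As context for the health-probe fix above, here is a minimal sketch, assuming the standard gRPC health-checking service, of deriving the reported status from actual sub-service state instead of hard-coding `SERVING`; the `healthy` callback and package name are hypothetical.

```go
// Minimal sketch using the standard gRPC health-checking service: the status
// reported on the empty ("overall") service name is derived from a health
// check instead of always being SERVING. The healthy callback is hypothetical.
package dmshealth

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func registerHealth(s *grpc.Server, healthy func() bool) *health.Server {
	hs := health.NewServer()
	healthpb.RegisterHealthServer(s, hs)

	status := healthpb.HealthCheckResponse_NOT_SERVING
	if healthy() {
		status = healthpb.HealthCheckResponse_SERVING
	}
	// Empty service name reports the overall server health.
	hs.SetServingStatus("", status)
	return hs
}
```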
🚧 Chores
- Bumped to version 1.3.0.
- Bumped protoc-gen-go to v1.36.5.
1.2.0 (2025-02-13)
🎁 Features
- observability: Added the new dms-dashboard that exposes some useful metrics.
- Added the `dms_service_health_state` metric.
- Added the `cluster_connection_state` metric (a registration sketch for both metrics follows this list).
- Made `verifyClone` optional, defaulting to `false`, for the thick-clone APIs.
- ansible-docker-role: allowed adding Linux users to the Docker group.
- Bumped temporal-ui version to temporalio/ui:2.35.0.
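As a rough illustration of the two new metrics, here is a sketch of registering them with the Prometheus Go client; the label names, value semantics, and help strings are assumptions rather than the exact DMS definitions.

```go
// Sketch of registering the new gauges with the Prometheus Go client.
// Label names and help text below are assumptions.
package dmsmetrics

import "github.com/prometheus/client_golang/prometheus"

var (
	serviceHealthState = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "dms_service_health_state",
		Help: "Health state of the DMS service and its sub-services.",
	}, []string{"service"})

	clusterConnectionState = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "cluster_connection_state",
		Help: "Connection state of each attached cluster.",
	}, []string{"cluster"})
)

func init() {
	prometheus.MustRegister(serviceHealthState, clusterConnectionState)
}
```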
🐞 Bug Fixes
- Ran discovery-client with `network_mode=host` (LBM1-36509).
- ansible-dms-role: stopped exposing ports 7233 and 5432 for enhanced security.
- Updated metrics emitted during clone/verify activities.
- Updated and exposed the port for dms-worker SDK metrics.
📄 Documentation
- Updated API docs with summary and descriptions.
- Added metrics of interest for monitoring error conditions in the DMS stack.
- Added day-two-operations.md to elaborate on maintenance and troubleshooting of the DMS server.
- Updated dmscli.md with response types.
🚧 Chores
- Bumped version to v1.2.0.
- Enriched metrics and logs of clone/verify with the src/dst endpoint.
- Excluded multipath devices for node-exporter diskstats.
1.1.0 (2025-01-22)
🎁 Features
- Added dms.alert.rules.yml.
- Added node-exporter.alert.rules.yml.
- Added support for optionally specifying the sector size (512/4096) on the destination resource (issue: LBM1-36164).
- Added a new `grpc_go` dashboard for DMS service gRPC visualization.
- dms.env: upgraded discovery-client to version 1.18.0.
🐞 Bug Fixes
- ansible: Fixed an issue of attaching multiple DMS services to the same cluster, due to the same pubKeyID being used across different DMS hosts (issue: LBM1-36163).
- thick-clone: Fixed an issue of thick cloning from a snapshot that originally had a sector size of 512 bytes (issue: LBM1-36164).
- metrics: Exposed metrics via an HTTP handler at the /metrics URL (see the sketch after this list).
- prometheus.yml: Updated the reference Prometheus config to scrape the DMS service correctly (the target destination was incorrectly specified).
- docker-compose.yml.j2: Fixed an issue with the discovery-client cfg file name.
- Fixed a permission access issue with DMS dynamicconfig files.
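For the /metrics fix above, a minimal sketch of exposing Prometheus metrics over an HTTP handler looks like the following; the listen address is illustrative.

```go
// Minimal sketch: serve Prometheus metrics on /metrics via promhttp.
// The listen address is illustrative.
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":2112", nil))
}
```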
📄 Documentation
- Documented metrics and alerts.
- Packaged proto files in the dms-docs tarball.
1.0.0 (2024-12-30)
🎁 Features
- Support of a single persistent volume for all DMS service files.
- DMSCLI: simplified the attach-cluster/get-credentials flow by exporting the base64-encoded public key directly to a file.
- Added support for pagination in list workflows (see the sketch after this list).
- Upgraded grpc-gateway to v2.25.1.
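To illustrate the list-workflows pagination, here is a sketch of token-based paging from the caller's side; the request/response field names (`PageSize`, `PageToken`, `NextPageToken`) and the client interface are hypothetical stand-ins, not the actual DMS API.

```go
// Sketch of consuming a token-paginated ListWorkflows API. All types and
// field names below are hypothetical stand-ins for illustration.
package dmslist

import "context"

type ListWorkflowsRequest struct {
	PageSize  int32
	PageToken string
}

type ListWorkflowsResponse struct {
	Workflows     []string
	NextPageToken string
}

type Client interface {
	ListWorkflows(ctx context.Context, req *ListWorkflowsRequest) (*ListWorkflowsResponse, error)
}

// listAllWorkflows follows NextPageToken until the server returns an empty token.
func listAllWorkflows(ctx context.Context, c Client) ([]string, error) {
	var all []string
	req := &ListWorkflowsRequest{PageSize: 100}
	for {
		resp, err := c.ListWorkflows(ctx, req)
		if err != nil {
			return nil, err
		}
		all = append(all, resp.Workflows...)
		if resp.NextPageToken == "" {
			return all, nil
		}
		req.PageToken = resp.NextPageToken
	}
}
```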
🐞 Bug Fixes
- Adjusted lbclient timeouts to 10s for every operation and added retries.
- Added verification and protections for cases where snapshots and volumes with the same name already exist on clusters.
- Fixed incomplete cleanup after some failed thick clone operations.
📈 Performance Improvements
- Enabled configuration of the number of concurrent workflows, changing the default from 2 to 10.
- Enabled configuration of the number of go-routines per workflow, and changed the default from the maximum number of cores to min(32, max cores) (see the sketch after this list).
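A one-line sketch of the new default described above, assuming the cap is computed from the host's core count; the helper name is illustrative.

```go
// Sketch of the default: goroutines per workflow = min(32, number of cores).
// The helper name is illustrative.
package dmsdefaults

import "runtime"

func defaultGoroutinesPerWorkflow() int {
	if n := runtime.NumCPU(); n < 32 {
		return n
	}
	return 32
}
```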
API Changes/Product Changes
- Updated lb-ansible version from v11.0.0 to v9.13.0.
- Renamed the following APIs (removing the `Create` prefix):
  - `CreateThickCloneVolumeRequest` -> `ThickCloneVolumeRequest`
  - `CreateThickCloneVolumeResponse` -> `ThickCloneVolumeResponse`
  - `CreateThickCloneSnapshotRequest` -> `ThickCloneSnapshotRequest`
  - `CreateThickCloneSnapshotResponse` -> `ThickCloneSnapshotResponse`
- Added protobuf files and generated Go code to the docs bundle.