Datamotive

Best practices

Performance

Optimize replication throughput, flow control, horizontal scale, and recovery concurrency for Easy Hybrid DR.

Product: Easy Hybrid DR
Version: v2.3.1
Release status: GA
Documentation status: Published

Easy Hybrid DR performance depends on the full replication pipeline, not only link bandwidth. Source reads, changed-block processing, network transfer, validation, cloud-native storage writes, and checkpoint finalization all contribute to effective throughput.

Replication flow control model

Figure: Replication flow control model showing read time, transfer time, validation time, write time, and adaptive flow control.
The engine adjusts inflight windows and parallel streams based on pressure across the end-to-end pipeline.

The replication engine adapts to:

  • Available WAN bandwidth
  • Source-side storage latency
  • CBT fragmentation characteristics
  • Cloud storage write latency
  • Parallel replication concurrency
  • Cloud API responsiveness

This helps maintain stable throughput while preventing network underutilization, storage-write overload, replication stalls, excessive inflight memory growth, and cloud API saturation.
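To make the idea of adjusting inflight windows under pressure concrete, here is a minimal sketch of an additive-increase, multiplicative-decrease controller. It is an illustration only, not the Easy Hybrid DR engine's algorithm: the `FlowController` class, its default window sizes, and the 200 ms write-latency threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class FlowController:
    """Hypothetical AIMD-style window controller (illustrative, not the product's logic)."""
    window: int = 8                         # inflight blocks currently allowed (assumed default)
    min_window: int = 1                     # never stop replication entirely
    max_window: int = 64                    # cap inflight memory growth (assumed value)
    write_latency_limit_ms: float = 200.0   # assumed pressure threshold

    def observe(self, write_latency_ms: float, throttled: bool) -> int:
        """Update the window from one write sample and return the new size."""
        if throttled or write_latency_ms > self.write_latency_limit_ms:
            # Pressure on the target side: halve the window to relieve writes and memory.
            self.window = max(self.min_window, self.window // 2)
        else:
            # Healthy pipeline: grow by one block to avoid underusing the WAN.
            self.window = min(self.max_window, self.window + 1)
        return self.window


if __name__ == "__main__":
    fc = FlowController()
    samples = [(50, False), (60, False), (350, False), (40, True), (45, False)]
    for latency_ms, throttled in samples:
        print(latency_ms, throttled, fc.observe(latency_ms, throttled))
```

Halving on pressure and growing by one block when healthy is one common way to keep throughput stable without letting inflight memory grow unbounded. A real engine would also weigh the other signals listed above, such as source read latency and CBT fragmentation.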

Performance test scenario

The following internal validation scenario demonstrates the shape of a large parallel replication test.

Parameter | Value
Source platform | Azure
Target platform | AWS
Replication topology | Azure to AWS
Replication Nodes | 1
Protected VMs | 40
Protected disks | 80
Replication interval | 20 minutes
Average changed data per disk | 700 to 800 MB
WAN bandwidth | 500 Mbps, burstable to 700 Mbps
Observed sustained throughput | Approximately 420 Mbps

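The engine sustained approximately 420 Mbps, reported as roughly 70 percent of practical WAN capacity (for reference, 84 percent of the 500 Mbps base rate and 60 percent of the 700 Mbps burst ceiling), while handling parallel disk replication, fragmented CBT workloads, multiple simultaneous VM streams, and continuous cloud-native storage writes.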
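The scenario figures can be turned into a rough cycle-time estimate. The sketch below uses only values from the table and assumes decimal units (1 MB = 8 megabits), that every disk changes by the average amount each cycle, and that transfer at the observed throughput is the only cost. It ignores any compression, protocol overhead, or validation time, and `cycle_minutes` is a hypothetical helper, not a product API.

```python
def cycle_minutes(disks: int, changed_mb_per_disk: float, throughput_mbps: float) -> float:
    """Estimate the transfer time, in minutes, for one replication cycle."""
    total_megabits = disks * changed_mb_per_disk * 8   # MB to megabits, decimal units
    return total_megabits / throughput_mbps / 60       # seconds to minutes


# Values from the test scenario: 80 disks, 700 to 800 MB changed per disk, 420 Mbps sustained.
low = cycle_minutes(80, 700, 420)
high = cycle_minutes(80, 800, 420)
print(f"{low:.1f} to {high:.1f} minutes per cycle")
```

Under these assumptions the estimate is about 17.8 to 20.3 minutes, which sits at or near the 20-minute replication interval. An estimate this close to the interval is the kind of signal that suggests adding Replication Nodes or bandwidth before the cycle duration exceeds the RPO.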

Tune for the bottleneck

Symptom | Likely bottleneck | Action
Replication cycle duration exceeds the RPO interval | Insufficient node, network, or target write capacity | Add Replication Nodes, increase bandwidth, or reduce workload concurrency per plan.
WAN is underutilized | Source read pressure, fragmented CBT, or low parallelism | Check source storage latency, CBT fragmentation, and parallel disk assignment.
Replication stalls during target writes | Cloud storage latency or API throttling | Review cloud throttling metrics and distribute workloads across additional nodes.
Node memory pressure increases during replication | Inflight windows are too large for current write conditions | Reduce concurrency or split workload groups across nodes.
Recovery jobs queue during failover | Cloud quota, API, or Prep Node concurrency limits | Stage recovery by protection plan and prepare quota increases before the event.

Performance best practices

  • Group workloads by RPO, churn, and source platform behavior.
  • Stagger replication schedules for large environments.
  • Add Replication Nodes before cycle duration consistently reaches the RPO interval.
  • Keep high-churn database and VDI workloads in smaller workload groups.
  • Monitor source storage latency, cloud write latency, retry rates, and API throttling.
  • Validate quota readiness before production onboarding and before large DR drills.
  • Use staged recovery orchestration for massive failovers.
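Two of these practices, staggering schedules and adding nodes before cycles reach the RPO interval, lend themselves to a small planning sketch. The helpers below are hypothetical and not part of the product: `stagger_offsets` spreads workload group start times evenly across one interval, and `needs_more_nodes` flags when recent cycle durations all exceed an assumed 80 percent headroom threshold.

```python
def stagger_offsets(groups: list[str], interval_minutes: float) -> dict[str, float]:
    """Spread group start times evenly across one replication interval."""
    if not groups:
        return {}
    step = interval_minutes / len(groups)
    return {name: round(i * step, 2) for i, name in enumerate(groups)}


def needs_more_nodes(recent_cycle_minutes: list[float],
                     interval_minutes: float,
                     headroom: float = 0.8) -> bool:
    """Return True when every recent cycle exceeds the headroom fraction of the interval."""
    if not recent_cycle_minutes:
        return False
    limit = headroom * interval_minutes   # assumed early-warning threshold
    return all(duration > limit for duration in recent_cycle_minutes)


if __name__ == "__main__":
    # Group names are placeholders for illustration.
    print(stagger_offsets(["db", "vdi", "web", "files"], 20))
    print(needs_more_nodes([17.0, 18.5, 19.0], 20))
```

Requiring every recent cycle to cross the threshold avoids reacting to a single slow cycle. The threshold and the number of cycles to inspect are tuning choices that depend on how much RPO risk is acceptable.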

Recovery concurrency

Parallel recovery depends on attached disk capacity for Prep Nodes, cloud API throttling, compute instance quotas, and network provisioning concurrency.

For scaled recoveries, use higher-capacity Prep Node configurations when more than the default attached disk count is required. Plan recoveries in protection-plan batches and pre-validate target cloud compute quotas.
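Batching by protection plan and pre-validating quota can be sketched as follows. This is an assumption-laden illustration rather than the product's orchestration logic: the `Plan` type, its fields, and the numbers are hypothetical, the disk limit is treated as a per-batch Prep Node attach capacity that frees up between batches, and the vCPU quota is checked against the total because recovered instances keep running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical summary of one protection plan."""
    name: str
    disks: int
    vcpus: int


def quota_shortfall(plans: list[Plan], vcpu_quota: int) -> int:
    """Return how many vCPUs are missing from the target quota (0 if sufficient)."""
    return max(0, sum(p.vcpus for p in plans) - vcpu_quota)


def batch_plans(plans: list[Plan], disk_capacity: int) -> list[list[str]]:
    """Group whole plans, in order, into batches that fit the attach capacity."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for plan in plans:
        if plan.disks > disk_capacity:
            raise ValueError(f"{plan.name} needs a higher-capacity Prep Node configuration")
        if current and used + plan.disks > disk_capacity:
            batches.append(current)       # close the batch before it overflows
            current, used = [], 0
        current.append(plan.name)
        used += plan.disks
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    plans = [Plan("erp", 12, 64), Plan("web", 6, 24), Plan("vdi", 10, 80), Plan("files", 4, 16)]
    print(quota_shortfall(plans, vcpu_quota=160))
    print(batch_plans(plans, disk_capacity=16))
```

Keeping plans whole and in their given order preserves recovery priority at the cost of slightly less dense batches. A plan that alone exceeds the attach capacity raises an error, which corresponds to the case where a higher-capacity Prep Node configuration is needed.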
