Scaling

How Datamotive scales management, replication, and recovery using horizontal node expansion and adaptive flow control.

Product: Datamotive Platform
Version: v1.0
Documentation status: Published

Datamotive scales horizontally across the data plane, and it scales management, replication, and recovery orchestration independently of one another. Add nodes when a single node cannot sustain the required RPO, replication throughput, or recovery concurrency.

Scalability model

Datamotive scalability is achieved through:

  • Parallel replication streams
  • Sliding-window replication
  • Vertical or horizontal replication-node expansion
  • Distributed recovery orchestration

Small environments can operate with minimal infrastructure. Large environments can add Replication Nodes to distribute data movement and recovery workloads. Recovery throughput scales independently of management operations, and replication throughput scales independently of orchestration operations.

Data plane scaling

Add Replication Nodes

Add Replication Nodes when cycle duration approaches or exceeds the configured RPO, node CPU or memory pressure stays elevated, or cloud-write latency increases under concurrency.

Each node operates independently while the control plane aggregates status across the site.

Environment size            Recommended Replication Nodes
Fewer than 50 workloads     1
50 to 300 workloads         2 to 4
300 to 1000 workloads       4 to 8
More than 1000 workloads    8+
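
As a rough illustration, the sizing table can be encoded as a lookup that returns the recommended node range for a workload count. This is a minimal sketch, not part of the Datamotive product: the function name is hypothetical, and because the table lists 300 in two rows, the sketch assumes boundary values (300 and 1000) fall in the smaller tier.

```python
from typing import Optional, Tuple


def recommended_replication_nodes(workloads: int) -> Tuple[int, Optional[int]]:
    """Return (min_nodes, max_nodes) for a workload count.

    max_nodes is None for the open-ended "8+" tier.
    Assumption: boundary counts (300, 1000) map to the smaller tier,
    since the source table does not say which row they belong to.
    """
    if workloads < 0:
        raise ValueError("workload count must be non-negative")
    if workloads < 50:
        return (1, 1)
    if workloads <= 300:
        return (2, 4)
    if workloads <= 1000:
        return (4, 8)
    return (8, None)


if __name__ == "__main__":
    for count in (20, 120, 650, 2500):
        print(count, recommended_replication_nodes(count))
```

Treat the result as a starting point; the scale signals later on this page indicate when the actual node count needs to change.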

Horizontal scaling provides increased aggregate throughput, better API distribution, improved recovery concurrency, reduced node-level contention, and better operational isolation.

Platform-specific scaling

VMware environments are commonly constrained by CBT fragmentation, snapshot performance, VDDK read concurrency, datastore latency, and WAN throughput. AWS and Azure environments are often constrained by cloud API throughput, snapshot or disk operation concurrency, instance network bandwidth, and regional quota behavior.

Control plane scaling

The Management Server handles UI, API, CLI, orchestration, scheduling, monitoring, reporting, inventory, and metadata workflows. Increase Management Server capacity as protected workload count, recovery orchestration concurrency, reporting retention, API activity, and multi-site deployment scale grow.

Environment scale    vCPU    Memory    Storage
Very small           4       8 GB      120 GB SSD
Small                4       16 GB     200 GB SSD
Medium               8       24 GB     500 GB SSD
Large                16      32 GB     1 TB SSD
Very large           32      64 GB     2 TB SSD

Recovery scaling

Recovery orchestration is parallelized but must respect cloud API limits, quota constraints, disk provisioning concurrency, storage write behavior, and network provisioning limits.
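
The general idea of parallel recovery under a fixed limit can be sketched with a semaphore that caps how many recoveries are in flight at once. This is a generic Python illustration, not Datamotive's orchestration code: `recover_all`, `recover_one`, and `max_concurrent` are hypothetical names, and in practice the cap would be chosen from the relevant cloud API, quota, or provisioning limit.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def recover_all(
    workloads: Iterable[str],
    recover_one: Callable[[str], Awaitable[str]],
    max_concurrent: int,
) -> List[str]:
    """Run recover_one for every workload, never more than max_concurrent at once."""
    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(workload: str) -> str:
        # Each recovery waits here until a slot is free.
        async with gate:
            return await recover_one(workload)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(w) for w in workloads))


async def _demo_recover(workload: str) -> str:
    # Stand-in for a real recovery call (hypothetical).
    await asyncio.sleep(0.01)
    return f"{workload}: recovered"


if __name__ == "__main__":
    names = [f"vm-{i}" for i in range(6)]
    for line in asyncio.run(recover_all(names, _demo_recover, max_concurrent=2)):
        print(line)
```

Lowering `max_concurrent` trades recovery speed for fewer throttling responses and quota waits.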

For Windows workloads, Prep Nodes provide recovery preparation and boot remediation. A Prep Node supports up to 20 parallel Windows recoveries; add Prep Nodes horizontally for larger recovery events.
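
The Prep Node count for a recovery event follows from the limit of 20 parallel recoveries per node. A minimal sketch of the arithmetic, assuming every Windows recovery in the event should run in a single parallel wave (the function name is hypothetical):

```python
import math

# From the text above: one Prep Node supports up to 20 parallel Windows recoveries.
PARALLEL_RECOVERIES_PER_PREP_NODE = 20


def prep_nodes_needed(windows_recoveries: int) -> int:
    """Prep Nodes required to run all Windows recoveries in one parallel wave."""
    if windows_recoveries < 0:
        raise ValueError("recovery count must be non-negative")
    return math.ceil(windows_recoveries / PARALLEL_RECOVERIES_PER_PREP_NODE)


if __name__ == "__main__":
    for count in (12, 20, 21, 95):
        print(count, "recoveries ->", prep_nodes_needed(count), "Prep Node(s)")
```

If recoveries may run in successive waves instead, fewer Prep Nodes suffice at the cost of a longer recovery event.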

Scale signals

Add capacity or reduce concurrency when you observe:

  • Replication cycle duration consistently exceeding 80 percent of the RPO interval.
  • CPU or memory pressure on a Replication Node during replication windows.
  • Cloud storage write latency increasing under parallel replication.
  • API throttling responses during replication or recovery.
  • Job queue depth growing over time.
  • Recovery batches waiting on VM, disk, NIC, or IP quota.
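
The first signal can be checked mechanically from recorded cycle durations. The sketch below is illustrative only: the function name is hypothetical, and it assumes "consistently" means every one of the most recent `window` cycles exceeded the threshold, which is one possible reading rather than a documented definition.

```python
from typing import Sequence


def rpo_headroom_exhausted(
    cycle_durations_s: Sequence[float],
    rpo_interval_s: float,
    window: int = 5,
    threshold: float = 0.8,
) -> bool:
    """True when each of the last `window` cycles ran longer than threshold * RPO.

    Assumption: "consistently" is interpreted as "all of the last `window` cycles".
    """
    recent = list(cycle_durations_s)[-window:]
    if len(recent) < window:
        return False  # not enough history to call the trend consistent
    limit = threshold * rpo_interval_s
    return all(duration > limit for duration in recent)


if __name__ == "__main__":
    rpo_s = 900  # example: 15-minute RPO, so the 80 percent limit is 720 s
    print(rpo_headroom_exhausted([730, 750, 760, 800, 790], rpo_s))
    print(rpo_headroom_exhausted([600, 750, 760, 800, 790], rpo_s))
```

A single slow cycle does not trigger the check, which avoids adding capacity in response to a transient spike.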
