The new 4.8.4 SynetoOS release brings an important change to the SynetoOS replication architecture.
Until release 4.7.7, recovery point creation and datastore replication were tightly coupled: replication started immediately after the snapshot.
This architecture served its purpose well, but it was limited.
Starting with the 4.8.3 release, the replication processes are separate from the recovery point creation and are orchestrated independently. Every time a recovery point is created, if the protection policy has a replication target defined, the process creates a replication task.
The replication task is managed by a new replication manager service.
Every 3 minutes, replications are started from the existing queue.
This new architecture has allowed Syneto to add two new improvements: automatic replication retries and forever incremental replication.
Replication - Automatic Retries
A failed replication is retried every three minutes for one hour - 20 retries. If replication has failed continuously for one hour, an alert email is sent.
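The retry cadence above can be sketched as follows. This is a minimal illustration of the stated policy (one attempt every 3 minutes, at most 20 attempts); the function name and structure are hypothetical, not SynetoOS internals.

```python
from datetime import datetime, timedelta

RETRY_INTERVAL = timedelta(minutes=3)
MAX_RETRIES = 20  # 20 retries * 3 minutes = 1 hour

def retry_schedule(first_failure):
    """Return the retry timestamps for a failed replication:
    one attempt every 3 minutes, for at most one hour."""
    return [first_failure + RETRY_INTERVAL * i for i in range(1, MAX_RETRIES + 1)]

times = retry_schedule(datetime(2024, 1, 1, 12, 0))
# the last retry lands exactly one hour after the first failure
```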
Replication - Forever Incremental
If a replication fails after having transferred 34% of the data, it resumes transferring from where it left off - at 34% - when it restarts. Subsequent retries continue transferring data incrementally. This approach ensures that data keeps being replicated even across unstable links.
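The resume-from-offset behavior can be sketched as a transfer that persists its confirmed offset between attempts. The class and its fields are illustrative assumptions, not the actual SynetoOS implementation:

```python
class IncrementalReplication:
    """Sketch: each attempt resumes from the last confirmed offset,
    so work done before a link failure is never repeated."""

    def __init__(self, total):
        self.total = total    # total bytes to replicate
        self.offset = 0       # bytes confirmed by the target so far

    def attempt(self, link_budget):
        """Transfer up to link_budget bytes before the (unstable)
        link drops; returns True once the replication completes."""
        sent = min(link_budget, self.total - self.offset)
        self.offset += sent
        return self.offset == self.total

r = IncrementalReplication(total=100)
r.attempt(34)          # link drops after 34% transferred
done = r.attempt(66)   # retry resumes at 34% and finishes
```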
In previous versions, the replication started right after the snapshot. How will it be on v4.8.3 and onwards?
It can take up to 3 minutes for a replication to start.
In previous versions, if a minute-by-minute and an hourly snapshot were taken at the same time, two replications would start and one would fail. The new architecture fixes this shortcoming.
If a datastore replication to a target is still ongoing, another replication to the same target will not be started. The queue is checked again after 3 minutes.
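The per-target check can be sketched as a simple in-flight set consulted on each 3-minute pass. The names and the per-target granularity are assumptions drawn from the description above:

```python
ongoing_targets = set()  # targets with a replication currently in flight

def try_start_replication(target):
    """On each 3-minute queue pass, start a queued replication only if
    no other replication to the same target is running; otherwise the
    task stays queued and is re-checked on the next pass."""
    if target in ongoing_targets:
        return False
    ongoing_targets.add(target)
    return True

try_start_replication("dr-site")  # starts: returns True
try_start_replication("dr-site")  # already in flight: returns False
```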
In previous versions, if a replication failed, it was retried only when the next snapshot was made. How will it be on v4.8.3 and onwards?
This was a problem especially for daily and weekly schedules. Now the system retries at most 3 minutes after a replication fails.
What if the target is out of space? Does the system still retry? No: it stops and sends an alert email. Replication is retried after the next recovery point.
In previous versions, an email was sent as soon as a replication failed.
With automatic retries, sending an email for every failure would be excessive. An email is now sent only if there has been no progress for one hour (20 retries).
If replications keep failing but are still transferring data after each failure, no alert email is sent.
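The alerting rule described above (alert only after one hour / 20 tries with no progress) can be sketched as a counter that resets whenever an attempt moves data. The class and method names are illustrative assumptions:

```python
ALERT_THRESHOLD = 20  # consecutive no-progress failures (~1 hour at 3-minute retries)

class AlertPolicy:
    """Sketch: a failure only counts toward the alert if it transferred
    no data; any progress resets the counter and suppresses the email."""

    def __init__(self):
        self.no_progress_failures = 0

    def on_failed_attempt(self, bytes_transferred):
        if bytes_transferred > 0:
            self.no_progress_failures = 0   # made progress: keep retrying quietly
        else:
            self.no_progress_failures += 1
        return self.no_progress_failures >= ALERT_THRESHOLD  # True -> send email

policy = AlertPolicy()
for _ in range(19):
    policy.on_failed_attempt(0)       # 19 stalled failures: still no alert
policy.on_failed_attempt(1024)        # this failure moved data: counter resets
```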
In previous versions, I could see info about the replication in the snapshot history. How will it be on v4.8.3 and onwards?
Initially this information was removed, since the snapshot and replication processes are now separate. After feedback from the beta, it was reintroduced, so it should work as before.
In previous versions, the user was able to select exclusion times. How will it be on v4.8.3 and onwards?
The user will still be able to select exclusion times. During those times, failed replications will not be retried.
What about ongoing replications? If a replication starts before the exclusion time, it will not be stopped; it will run until it finishes or fails.
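The exclusion-time rule can be sketched as a check made only when deciding whether to start a retry; a replication already in flight is never interrupted. This is a minimal illustration assuming a window that does not cross midnight:

```python
from datetime import time

def may_start_retry(now, exclusion_start, exclusion_end):
    """Failed replications are not retried inside the exclusion window.
    Note: this gate applies only to starting retries; a replication that
    is already running is left to finish or fail on its own."""
    return not (exclusion_start <= now < exclusion_end)

may_start_retry(time(3, 0), time(2, 0), time(4, 0))  # inside window: returns False
may_start_retry(time(5, 0), time(2, 0), time(4, 0))  # outside window: returns True
```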