VoltDB WAN Replication – Pre-release Users Invited

written by John Piekos on January 26, 2012

The subject of my last blog post was an introduction to VoltDB Replication, one of the major product additions that we’ve been working on.  Well, with this post, I’m happy to announce that we’ve recently pre-released VoltDB Replication to selected users to perform beta-level validation!

VoltDB WAN replication involves duplicating the contents of one database cluster (known as the master) to another database cluster (known as the replica). The process of retrieving completed transactions from the master and applying them to the replica is managed by a separate process called the Disaster Recovery (DR) agent.

The DR agent is critical to the WAN replication process. It performs the following tasks:

  • Initiates the replication, telling the master database to start queuing completed transactions and establishing a special client connection to the replica.
  • POLLs and ACKs the completed transactions from the master database and recreates the transactions on the replica.
  • Monitors the replication process, detects possible errors in the replica or delays in synchronizing the two clusters, and — when necessary — reports error conditions and cancels replication.
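
To make the poll-and-acknowledge cycle concrete, here is a minimal Python sketch of one pass of an agent loop. It is not VoltDB code: the `Master` and `Replica` classes, their methods, and the `replicate_once` function are all hypothetical in-memory stand-ins, and the batch size is an arbitrary assumption. The idea it illustrates is simply that the agent reads queued transactions, replays them on the replica as ordinary procedure calls, and only then acknowledges them so the master can discard them.

```python
from collections import deque


class Master:
    """Hypothetical stand-in for the master cluster's queue of completed transactions."""

    def __init__(self):
        self._queue = deque()  # completed write transactions awaiting an ACK
        self._next_id = 1

    def commit(self, procedure, *params):
        # Queue a completed write transaction for replication.
        self._queue.append((self._next_id, procedure, params))
        self._next_id += 1

    def poll(self, max_batch):
        # Return up to max_batch queued transactions without removing them.
        return list(self._queue)[:max_batch]

    def ack(self, last_id):
        # Discard every queued transaction up to and including last_id.
        while self._queue and self._queue[0][0] <= last_id:
            self._queue.popleft()

    def pending(self):
        return len(self._queue)


class Replica:
    """Hypothetical stand-in for the replica, reached through a client-style call."""

    def __init__(self):
        self.applied = []

    def call_procedure(self, procedure, *params):
        self.applied.append((procedure, params))


def replicate_once(master, replica, max_batch=2):
    """Run one POLL / apply / ACK pass and return how many transactions moved."""
    batch = master.poll(max_batch)
    for _txn_id, procedure, params in batch:
        replica.call_procedure(procedure, *params)
    if batch:
        # ACK only after the whole batch has been applied to the replica.
        master.ack(batch[-1][0])
    return len(batch)
```

Acknowledging after the replica has applied the batch means that, in this sketch, a failure mid-batch leaves the transactions queued on the master so they can be polled again.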

Note that the DR agent can be located anywhere. However, the replication process is optimized for the DR agent to be co-located with the replica database.

Communication between the DR agent and the master database is kept small to avoid bottlenecks: only write transactions are replicated, and the messages between the master and the agent are compressed. By contrast, the DR agent sends transactions to the replica using standard client invocations. Therefore, locating the DR agent near the replica is recommended.

VoltDB WAN replication runs silently in the background, providing security against unexpected disruptions. The replication process is designed to withstand normal operational glitches, such as WAN latency and minor short-term network connectivity issues.  Both the master database and the DR agent maintain queues to handle fluctuations in the transmission of transactions.

Network hiccups or a sudden increase of load on the master database can cause delays. Nodes on the master cluster may fail and rejoin (assuming K-safety). The queues in both the master and the DR agent help the replication process survive such interruptions. Further, VoltDB provides statistics and logging that enable you to monitor these situations and react accordingly.
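
As a rough illustration of how a queue can absorb a burst while still flagging a replica that is falling too far behind, here is a small Python sketch. The `ReplicationBuffer` class and its `warn_at` and `cancel_at` thresholds are hypothetical; they are assumptions made for the example and do not describe how VoltDB actually sizes its queues or decides to cancel replication.

```python
from collections import deque


class ReplicationBuffer:
    """Hypothetical queue between a bursty producer and a slower consumer."""

    def __init__(self, warn_at, cancel_at):
        self.warn_at = warn_at      # assumed backlog size that triggers a warning
        self.cancel_at = cancel_at  # assumed backlog size at which replication stops
        self._items = deque()

    def enqueue(self, txn):
        # Producer side: a completed transaction arrives.
        self._items.append(txn)

    def drain(self, n):
        # Consumer side: remove and return up to n transactions, oldest first.
        count = min(n, len(self._items))
        return [self._items.popleft() for _ in range(count)]

    def status(self):
        # Classify the current backlog against the two thresholds.
        backlog = len(self._items)
        if backlog >= self.cancel_at:
            return "cancel"
        if backlog >= self.warn_at:
            return "warn"
        return "ok"
```

A short burst pushes the status to "warn" and it returns to "ok" once the consumer catches up; only a sustained backlog reaches "cancel".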

If you’d like to participate in the VoltDB Replication pre-release program, please drop me an email at jpiekos@voltdb.com. If not, be sure to keep an eye out for general availability of this new feature in early spring 2012!