4.3. Upgrading the Cluster

Sometimes you need to update or reconfigure the server infrastructure on which the VoltDB database is running. One example is a server upgrade: fixing or replacing hardware, updating the operating system, or otherwise modifying the underlying system. Server upgrades usually require stopping the VoltDB database process on the specific server being serviced.

Another example is reconfiguring the cluster as a whole, either to add or remove servers or to change the number of partitions per server (sites per host) that VoltDB uses.

Adding servers to the cluster can happen without stopping the database. This is called elastic scaling. Removing servers or changing the number of sites per host requires restarting the cluster during a maintenance window.

The following sections describe three cases of cluster upgrades:

  • Performing server upgrades

  • Adding servers to a running cluster through elastic scaling

  • Reconfiguring the cluster with a maintenance window

4.3.1. Performing Server Upgrades

If you need to upgrade or replace the hardware or software (such as the operating system) of individual servers, you can do so without taking down the database as a whole. As long as the cluster is running with a K-safety value of one or more, you can take a server out of the cluster without stopping the database. You can then fix the server hardware, upgrade software (other than VoltDB), or even replace the server entirely with a new one, and then bring the server back into the cluster.

To perform a server upgrade:

  1. Stop the VoltDB server process on the server. As long as the cluster is K-safe, the rest of the cluster will continue running.

  2. Perform the necessary upgrades.

  3. Have the server rejoin the cluster using the voltdb rejoin command.

The rejoin command starts the database process on the server, contacts the database cluster, then copies the necessary partition content from other cluster nodes so the server can participate as a full member of the cluster. While the server is rejoining, the other database servers remain accessible and actively process queries from client applications.

When rejoining a cluster you must specify a host server that the rejoining node will connect to. The host can be any server still in the cluster; it does not have to be the same host specified when the cluster was initially started. For example:

$ voltdb rejoin --host=voltsvr4 \
         --deployment=deployment.xml \
         --license=~/license.xml

Note that you do not need to specify the application catalog. It is downloaded from the other cluster nodes as part of the rejoin operation.

If you need to upgrade all of the servers in the cluster (for example, if you are upgrading the operating system), the easiest method is to upgrade the servers one at a time, taking each server out of the cluster, upgrading it, then rejoining it to the cluster. This way the entire cluster can be upgraded without losing any availability to the database.
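
The one-at-a-time procedure can be sketched as a dry-run plan. This is only an illustration under stated assumptions: the host names in HOSTS and the helper functions pick_peer and plan_rolling_upgrade are hypothetical, the script prints the plan without running anything, and the "stop" and "upgrade" steps are left as placeholders because they depend on your environment. The one rule it encodes is that the --host argument to voltdb rejoin must name a server that is still in the cluster, so it cannot be the server being upgraded.

```shell
#!/bin/sh
# Dry-run sketch of a rolling upgrade: prints the plan, changes nothing.
# Host names are hypothetical examples.
HOSTS="voltsvr1 voltsvr2 voltsvr3 voltsvr4"

# Pick any cluster member other than the one being upgraded; the rejoining
# node needs a host that is still in the cluster.
pick_peer() {
    target=$1
    for h in $HOSTS; do
        if [ "$h" != "$target" ]; then
            echo "$h"
            return 0
        fi
    done
    return 1
}

# Print the per-server steps, one server at a time.
plan_rolling_upgrade() {
    for target in $HOSTS; do
        peer=$(pick_peer "$target") || return 1
        echo "[$target] stop the VoltDB server process"
        echo "[$target] perform the upgrade"
        echo "[$target] voltdb rejoin --host=$peer --deployment=deployment.xml --license=~/license.xml"
        echo "[$target] wait for the rejoin to complete before continuing"
    done
}

plan_rolling_upgrade
```

Waiting for each rejoin to finish before moving to the next server is an assumption of this sketch; it keeps the number of servers out of the cluster at one, which is what a K-safety value of 1 can tolerate.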

If the cluster is not K-safe — that is, the K-safety value is 0 — then you must follow the instructions in Section 4.3.3, “Reconfiguring the Cluster During a Maintenance Window” to upgrade the servers.

4.3.2. Adding Servers to a Running Cluster with Elastic Scaling

If you want to add servers to a VoltDB cluster — usually to increase performance and/or capacity — you can do this without having to restart the database. You add servers to the cluster with the voltdb add command, specifying one of the existing nodes with the --host flag. For example:

$ voltdb add --host=voltsvr4 \
         --license=~/license.xml

You must add a full complement of servers to match the K-safety value (K+1) before the servers can participate in the cluster. For example, if the K-safety value is 2, you must add 3 servers before they actually become part of the cluster and the cluster rebalances its partitions.
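
The K+1 rule is simple arithmetic, and it can be handy to check it before starting new nodes. The following is a minimal sketch; servers_to_add and servers_remaining are hypothetical helper names, not VoltDB commands.

```shell
#!/bin/sh
# Number of servers that must be added together for a given K-safety value.
# Hypothetical helper, not part of VoltDB.
servers_to_add() {
    k=$1
    echo $((k + 1))
}

# Number of servers still needed before the new group can join, given how
# many have already been started with "voltdb add".
servers_remaining() {
    k=$1
    started=$2
    echo $(( (k + 1) - started ))
}

servers_to_add 2        # prints 3
servers_remaining 2 1   # prints 2
```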

When you add servers to a VoltDB database, the cluster performs the following actions:

  1. The new servers are added to the cluster configuration and sent copies of the application catalog and deployment file.

  2. Once sufficient servers are added, copies of all replicated tables and their share of the partitioned tables are sent to the new servers.

  3. As the data is rebalanced, the new servers begin processing transactions for the partition content they have received.

  4. Once rebalancing is complete, the new servers are full members of the cluster.

4.3.3. Reconfiguring the Cluster During a Maintenance Window

If you want to remove servers from the cluster permanently (as opposed to temporarily removing them for maintenance as described in Section 4.3.1, “Performing Server Upgrades”) or you want to change other cluster-wide attributes, such as the number of partitions per server, you need to restart the database. Stopping the database temporarily to perform this sort of reconfiguration is known as a maintenance window.

The steps for reconfiguring the cluster with a maintenance window are:

  1. Place the database in admin mode (voltadmin pause).

  2. Perform a manual snapshot of the database (voltadmin save).

  3. Shut down the database (voltadmin shutdown).

  4. Make the necessary changes to the deployment file.

  5. Start a new database using the voltdb create command, the existing catalog, and the edited deployment file.

  6. Restore the snapshot created in Step 2 (voltadmin restore).

  7. Return the database to normal operations (voltadmin resume).
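
The steps above could be wrapped in a small script along the following lines. Treat this as a sketch, not a tested procedure: the snapshot directory and ID are hypothetical values, the run, stop_for_maintenance, and resume_after_restart helpers are invented for illustration, and the arguments to voltadmin save and voltadmin restore (a directory and a unique ID, plus --blocking on save) are assumptions that should be checked against the command reference for your VoltDB version. By default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the maintenance-window steps. With DRY_RUN=1 (the default
# here) each command is printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical values for illustration; substitute your own.
SNAPSHOT_DIR=/tmp/voltdb/backup
SNAPSHOT_ID=maintenance

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1-3: pause, snapshot, shut down. Stops at the first command that
# reports failure, so shutdown is not attempted after a failed snapshot.
stop_for_maintenance() {
    run voltadmin pause || return 1
    # --blocking (assumed to be available in your version) waits for the
    # snapshot to finish before returning.
    run voltadmin save --blocking "$SNAPSHOT_DIR" "$SNAPSHOT_ID" || return 1
    run voltadmin shutdown
}

# Steps 4-5 are manual: edit the deployment file, then start the database on
# each server with voltdb create.

# Steps 6-7: restore the snapshot and return to normal operations.
resume_after_restart() {
    run voltadmin restore "$SNAPSHOT_DIR" "$SNAPSHOT_ID" || return 1
    run voltadmin resume
}
```

Call stop_for_maintenance before editing the deployment file, and resume_after_restart once the new database is running on all servers. The two halves are kept separate because Steps 4 and 5 happen on the individual servers and cannot be driven from a single sequential script.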