There are times when it is necessary to save the contents of a VoltDB database to disk and then restore it. For example, if the cluster needs to be shut down for maintenance, you may want to save the current state of the database before shutting down the cluster and then restore the database once the cluster comes back online. Performing periodic backups of the data can also provide a fallback in case of unexpected failures, whether physical failures, such as power outages, or logic errors where a client application mistakenly corrupts the database contents.
VoltDB provides shell commands, system procedures, and an automated snapshot feature that help you perform these operations. The following sections explain how to save and restore a running VoltDB cluster, either manually or automatically.
Manually saving and restoring a VoltDB database is useful when you need to do maintenance on the database itself or the cluster it runs on, for example, when you need to upgrade the hardware or add a new node to the cluster. The normal use of save and restore during such a maintenance operation is as follows:
Stop database activities (using pause).
Use save to write a snapshot of the current data to disk.
Shut down the cluster.
Make changes to the VoltDB catalog and/or deployment file (if desired).
Restart the cluster in admin mode.
Restore the previous snapshot.
Restart client activity (using resume).
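The steps above might be scripted along the following lines. This is a minimal sketch, not a definitive procedure: the function names and the SNAP_DIR and SNAP_ID variables are illustrative assumptions, the voltadmin shutdown command is assumed to be available in your version, and the sketch assumes voltadmin reports failure through its exit status. Changing the catalog or deployment file and restarting the cluster in admin mode are left as manual steps, since they depend on your installation.

```shell
#!/bin/sh
# Hypothetical helpers for the maintenance sequence described above.
# SNAP_DIR and SNAP_ID are example values, not defaults defined by VoltDB.
SNAP_DIR=/tmp/voltdb/backup
SNAP_ID=TestSnapshot

# Pause the database, save a snapshot, and shut down the cluster.
# Each command runs only if the previous one succeeded, so the cluster
# is not shut down when the save reports a failure.
save_and_stop() {
    voltadmin pause &&
    voltadmin save --blocking "$SNAP_DIR" "$SNAP_ID" &&
    voltadmin shutdown
}

# After the cluster has been restarted in admin mode (done separately),
# restore the snapshot and then let client activity resume.
restore_and_resume() {
    voltadmin restore "$SNAP_DIR" "$SNAP_ID" &&
    voltadmin resume
}
```

You would call save_and_stop before the maintenance work and restore_and_resume once the cluster is running again in admin mode. The status messages printed by each command should still be reviewed, since the script only reacts to the exit status.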
The key is to make sure that all database activity is stopped before the save and shutdown are performed. This ensures that no further changes to the database are made (and therefore lost) after the save and before the shutdown. Similarly, it is important that no client activity starts until the database has started and the restore operation completes.
Save and restore operations are performed either by calling VoltDB system procedures or by using the corresponding voltadmin shell commands. In most cases, the shell commands are simpler because they do not require program code. Therefore, this chapter uses voltadmin commands in the examples. If you are interested in programming the save and restore procedures, see Appendix F, System Procedures, for more information about the corresponding system procedures.
If you are using the VoltDB Enterprise Edition, you can also use the Enterprise Manager to perform many of these tasks from within the management console. See the VoltDB Management Guide for details.
When you issue a save command, you specify a path where the data will be saved and a unique identifier for tagging the files. VoltDB then saves the current data on each node of the cluster to a set of files at the specified location (using the unique identifier as a prefix to the file names). This set of files is referred to as a snapshot, since it contains a complete record of the database for a given point in time (when the save operation was performed).
The --blocking option lets you specify whether the save operation should block other transactions until it completes. In the case of manual saves, it is a good idea to use this option since you do not want additional changes made to the database during the save operation.
Note that every node in the cluster uses the same absolute path, so the path specified must be valid, must exist on every node, and must not already contain data from any previous saves using the same unique identifier, or the save will fail.
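The path requirements can be checked before issuing the save. The following is a minimal sketch of such a check for a single node; the function name snapshot_path_ok is hypothetical, and the check relies only on the fact that snapshot file names start with the unique identifier. It would need to be run on every node of the cluster.

```shell
#!/bin/sh
# Hypothetical pre-save check for one node: succeeds only if the
# directory is an absolute path, exists, and holds no files whose
# names start with the given unique identifier.
snapshot_path_ok() {
    dir=$1
    id=$2

    # The same absolute path is used on every node.
    case $dir in
        /*) ;;
        *) echo "path is not absolute: $dir" >&2; return 1 ;;
    esac

    if [ ! -d "$dir" ]; then
        echo "directory does not exist: $dir" >&2
        return 1
    fi

    # If nothing matches, the pattern is left unexpanded and the -e
    # test fails, so an empty directory passes the check.
    for f in "$dir/$id"*; do
        if [ -e "$f" ]; then
            echo "existing file with identifier $id: $f" >&2
            return 1
        fi
    done
    return 0
}
```

For example, snapshot_path_ok /tmp/voltdb/backup TestSnapshot returns success when the directory is ready for a new save. The prefix match is deliberately conservative: it also rejects files from a different identifier that merely begins with the same characters.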
When you issue a restore command, you specify the same absolute path and unique identifier used when creating the snapshot. VoltDB checks to make sure the appropriate save set exists on each node, then restores the data into memory.
To save the contents of a VoltDB database, use the voltadmin save command. The following example creates a snapshot at the path /tmp/voltdb/backup using the unique identifier TestSnapshot.
$ voltadmin save --blocking /tmp/voltdb/backup "TestSnapshot"
In this example, the command tells the save operation to block all other transactions until it completes. It is possible to save the contents without blocking other transactions (which is what automated snapshots do). However, when performing a manual save prior to shutting down, it is normal to block other transactions to ensure you save a known state of the database.
Note that it is possible for the save operation to succeed on some nodes of the cluster and not others. When you issue the voltadmin save command, VoltDB displays messages from each partition indicating the status of the save operation. If there are any issues that would stop the process from starting, such as a bad file path, they are displayed on the console. It is good practice to examine these messages to make sure all partitions are saved as expected.
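Because a save fails if the target directory already contains files from an earlier save with the same unique identifier, repeated manual saves to one directory need a fresh identifier each time. One possible approach, shown here as a sketch, is to append a timestamp to a fixed prefix. The function name make_snapshot_id and the Backup prefix are illustrative assumptions, and the sketch keeps the identifier to letters, digits, and underscores as a cautious choice.

```shell
#!/bin/sh
# Hypothetical helper: build a snapshot identifier from a prefix and
# the current date and time, for example Backup_20240131_235959.
make_snapshot_id() {
    prefix=$1
    printf '%s_%s\n' "$prefix" "$(date +%Y%m%d_%H%M%S)"
}

# Example use (not run here):
#   voltadmin save --blocking /tmp/voltdb/backup "$(make_snapshot_id Backup)"
```

If you generate identifiers this way, record the identifier used for each save, since the restore command requires exactly the same value.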
To restore a VoltDB database from a snapshot previously created by a save operation, you use the voltadmin restore command. You must specify the same pathname and unique identifier used during the save.
The following example restores the snapshot created by the example in Section 9.1.1.
$ voltadmin restore /tmp/voltdb/backup "TestSnapshot"
As with save operations, it is always a good idea to check the status information displayed by the command to ensure the operation completed as expected.
Between a save and a restore, it is possible to make selected changes to the database. You can:
Add nodes to the cluster
Modify the database schema
Add, remove, or modify stored procedures
To make these changes, you must, as appropriate, edit the database schema, the procedure source files, or the deployment file. You can then recompile the application catalog and distribute the updated catalog and deployment file to the cluster nodes before restarting the cluster and performing the restore.
To add nodes to the cluster, use the following procedure:
Save the database.
Edit the deployment file, specifying the new number of nodes in the hostcount attribute of the <cluster> tag.
Restart the cluster (including the new nodes).
Issue a restore command.
When the snapshot is restored, the database (and partitions) are redistributed over the new cluster configuration.
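The edit to the deployment file can be made by hand or scripted. The following sketch shows one way to update the hostcount attribute with sed. The function name set_hostcount is hypothetical, and the sketch assumes that the hostcount attribute appears on the same line as the opening <cluster tag; check the file after running it.

```shell
#!/bin/sh
# Hypothetical helper: set the hostcount attribute of the <cluster> tag
# in a deployment file. Assumes hostcount is on the same line as <cluster.
set_hostcount() {
    file=$1
    count=$2

    # Accept only a string made up entirely of digits.
    case $count in
        ''|*[!0-9]*)
            echo "host count must be a whole number: $count" >&2
            return 1 ;;
    esac

    # Write to a temporary file first so the original is replaced only
    # if sed succeeds.
    tmp="$file.tmp.$$"
    sed '/<cluster/ s/hostcount="[0-9][0-9]*"/hostcount="'"$count"'"/' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, set_hostcount deployment.xml 5 changes hostcount="3" to hostcount="5" and leaves the other attributes of the tag untouched. The file name deployment.xml is only an example.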
It is also possible to remove nodes from the cluster using this procedure. However, to make sure that no data is lost in the process, you must copy the snapshot files from the nodes that are being removed to one of the nodes remaining in the cluster. This way, the restore operation can find and restore the data from partitions on the missing nodes.
To modify the database schema or stored procedures, make the appropriate changes to the source files (that is, the database DDL and the stored procedure Java source files), then recompile the application catalog. However, you can only make certain modifications to the database schema. Specifically, you can:
Add or remove tables.
Add or remove columns from tables.
Change the datatypes of columns, assuming the two datatypes are compatible (that is, the data can be converted from the old type to the new type, for example, by extending the length of VARCHAR columns or converting between two numeric datatypes).
Note that you cannot rename tables or columns and retain the data. If you rename a table or column, it is equivalent to deleting the original table/column (and its data) and adding a new one. Two other important points to note when modifying the database structure are:
When existing rows are restored to tables where new columns have been added, the new columns are filled with either the default value (if defined by the schema) or nulls.
When changing the datatypes of columns, it is possible to decrease the datatype size (for example, going from an INT to a TINYINT). However, if any existing values exceed the capacity of the new datatype (such as an integer value of 5,000 where the datatype has been changed to TINYINT), the entire restore will fail.
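Since a single out-of-range value causes the whole restore to fail, it can be worth checking the existing values before narrowing a column. The following sketch reads one integer per line from standard input and reports the first value that would not fit. The function name and the limits are assumptions: the range -127 to 127 is used here for TINYINT, and you should confirm the limits for your version. How you export the column values to a file is outside the scope of this sketch.

```shell
#!/bin/sh
# Assumed TINYINT limits; confirm them against your VoltDB version.
TINYINT_MIN=-127
TINYINT_MAX=127

# Hypothetical check: read one integer per line from standard input.
# Prints the first value outside the range and returns 1, or returns 0
# if every value fits.
first_out_of_range() {
    while IFS= read -r value; do
        if [ "$value" -lt "$TINYINT_MIN" ] || [ "$value" -gt "$TINYINT_MAX" ]; then
            echo "$value"
            return 1
        fi
    done
    return 0
}
```

For example, feeding the values 12, 5000, and 3 to first_out_of_range prints 5000 and returns a failure status, which indicates that the column cannot safely be changed to TINYINT.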
If you remove or modify stored procedures (particularly if you change the number and/or datatype of the parameters), you must make sure the corresponding changes are made to all client applications as well.