Release Notes


Product

VoltDB Community Edition

Version

3.3

Release Date

May 22, 2013

This document provides information about known issues and limitations to the current release of the VoltDB Community Edition. If you encounter any problems not listed below, please be sure to report them using the VoltDB forums at http://forum.voltdb.com/. Thank you.

Important Base Platform Considerations

The recommended platform for production use of VoltDB is CentOS 5.8 or later, CentOS 6.3 or later, or Ubuntu 10.04 or 12.04, with Sun JDK 6 Update 21 or later. Macintosh OS X 10.6 and later is supported as a development platform. However, certain configuration options in the base platforms are important when running VoltDB.

1.1.

Disable Swapping

Swapping is an operating system feature that optimizes memory usage when running multiple processes. However, memory is a critical component of the VoltDB server process. Any contention for memory, including swapping, will have a very negative impact on performance and functionality.

We recommend using dedicated servers and disabling swapping when running the VoltDB database server process. Use the swapoff command to disable swapping on Linux systems. If swapping cannot be disabled for any reason, you can reduce the likelihood of VoltDB being swapped out by setting the kernel parameter vm.swappiness to zero.
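
As a quick check of the setting described above, the following Python sketch reads the current vm.swappiness value. It assumes a Linux host that exposes the parameter at /proc/sys/vm/swappiness. The function names are illustrative and are not part of VoltDB.

```python
from pathlib import Path

# Standard Linux location of the vm.swappiness kernel parameter.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an integer."""
    return int(Path(path).read_text().strip())


def swappiness_ok(value):
    """True when swappiness is zero, the value recommended above."""
    return value == 0
```

On a database server, calling swappiness_ok(read_swappiness()) before starting VoltDB gives an early warning if the kernel parameter has been reset.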

1.2.

Turn off TCP segmentation offload and generic receive offload if cluster stability is a problem.

There is an issue where, under certain conditions, the use of TCP segmentation offload (TSO) and generic receive offload (GRO) can cause nodes to randomly drop out of a cluster. The symptom is that nodes time out (that is, the rest of the cluster thinks they have failed) although the nodes are still running and no other network issue, such as a network partition, is the cause.

Disabling TSO and GRO is recommended for any VoltDB cluster that experiences such instability. The commands to disable offloading are the following, where N is replaced by the number of the Ethernet card:

ethtool -K ethN tso off
ethtool -K ethN gro off

Note that these commands disable offloading temporarily. You must issue these commands every time the node reboots.
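
Because the commands must be reissued after every reboot, it can help to wrap them in a small script. The following Python sketch builds the two ethtool commands shown above for a given interface and optionally runs them. The function names and the dry_run parameter are hypothetical. How the script is hooked into the boot sequence depends on the system and is not shown.

```python
import subprocess


def offload_commands(interface):
    """Build the two ethtool commands that disable TSO and GRO."""
    return [
        ["ethtool", "-K", interface, "tso", "off"],
        ["ethtool", "-K", interface, "gro", "off"],
    ]


def disable_offload(interface, dry_run=True):
    """Print the commands, and run them unless dry_run is set."""
    for command in offload_commands(interface):
        print(" ".join(command))
        if not dry_run:
            # Requires root privileges and the ethtool utility.
            subprocess.run(command, check=True)
```

For example, disable_offload("eth0") only prints the commands, while disable_offload("eth0", dry_run=False) executes them.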

Upgrading From Older Versions

When upgrading from a previous version of VoltDB — especially with an existing database — there are a number of important notes that you should be aware of. Some changes to the structure and syntax of the VoltDB schema and deployment files may make old application catalogs and configuration files incompatible with newer versions.

Although incompatible changes are avoided wherever possible, some changes are necessary to add new features. It is always recommended that application catalogs be recompiled when upgrading the VoltDB version. It is also important to note that the catalog is saved as part of snapshots and command logging. As a consequence, you must be careful to ensure an incompatible catalog is not loaded accidentally by starting a database with the recover action after an upgrade.

The process for upgrading VoltDB for a running database is as follows:

  1. Place the database in admin mode using the @Pause system procedure (or VoltDB Enterprise Manager).

  2. Perform a manual snapshot of the database (using @SnapshotSave).

  3. Shut down the database (using @Shutdown).

  4. Upgrade VoltDB.

  5. Recompile the application catalog using the new version of VoltDB.

  6. Restart the database using the create option and the new catalog, starting in admin mode (specified in the deployment file).

  7. Restore the snapshot created in Step #2 (using voltadmin restore).

  8. Return the database to normal operations (using voltadmin resume).

When using the Enterprise Manager, it is also recommended that you delete the Enterprise Manager configuration files (stored by default in the .voltdb subfolder in the home directory of the current account) when performing an upgrade.

Changes Since the Last Release

Users of previous versions of VoltDB should take note of the following changes, made since the last release, that may affect their existing applications.

Important

All existing VoltDB users are strongly recommended to upgrade to version 3.2.0.1 or later at their earliest possible convenience. These releases contain two major fixes:

  • A race condition was discovered as part of our ongoing internal stress tests. The issue, which could impact procedure invocation and exists in all versions prior to 3.2, is not known to have been encountered "in the wild". However, the potential consequences are severe enough that we recommend all users upgrade as soon as possible.

  • A problem exists in previous releases related to the use of live schema updates. If the new schema adds or removes columns from a unique index, the update can cause the database to stop running and make the command logs unusable for recovery.

For users of pre-3.0 releases of VoltDB who cannot or do not wish to upgrade to the latest version, a patch, version 2.8.4.5, correcting these issues is available. See the VoltDB news & announcements forum for details.

1. Release V3.3

1.1.

Non-Blocking Statistical System Procedures

Previously all VoltDB system procedures operated as database transactions. That is, system procedures executed as multi-partition transactions with two consequences:

  • Executing a system procedure would block database access until the system procedure completed.

  • If a long-running transaction was executing, the system procedure could not be performed.

Since statistical procedures do not change the state of the database, they are now executed separately from database transactions. This applies to the system procedures @Statistics, @SystemInformation, and @SystemCatalog. The consequences of this change are that the statistical system procedures have significantly less impact on database performance and, although the resulting statistics are not transactionally exact, they are an accurate reflection of overall database performance at the time the procedure executes. More importantly, it is now possible to get information about database operations even when an errant stored procedure or server is blocking other database operations.

1.2.

Outer and Explicit Joins

VoltDB now supports outer and explicit joins. Previously, all joins were implicitly inner joins (using a comma-separated list of table names). You can now use explicit join syntax to specify either inner or outer joins on specific columns using the ON or USING clause. The new syntax for the SELECT statement is as follows:

Select-statement [{set-operator} Select-statement ] ...

Select-statement:
SELECT [ TOP integer-value ]
{ * | [ ALL | DISTINCT ] { column-name | selection-expression } [AS alias] [,...] }
FROM table-reference [ join-clause ]...
[WHERE [NOT] boolean-expression [ {AND | OR} [NOT] boolean-expression]...]
[clause...]

table-reference:
{ table-name | view-name } [AS alias]

join-clause:
,table-reference
[ INNER | {LEFT | RIGHT} [OUTER] ] JOIN {table-reference} [join-condition]

join-condition:
ON conditional-expression
USING (column-reference [,...])

clause:
ORDER BY { column-name | alias } [ ASC | DESC ] [,...]
GROUP BY { column-name | alias } [,...]
LIMIT { integer-value [OFFSET row-count] | ALL }

set-operator:
UNION [ALL]
INTERSECT [ALL]
EXCEPT

For this initial release, outer joins support the joining of two tables only. Explicit and implicit inner joins support joining more than two tables.
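
To illustrate the explicit outer join syntax, the following Python sketch runs a two-table LEFT OUTER JOIN with an ON clause. It uses SQLite, through the sqlite3 module in the Python standard library, purely as a stand-in SQL engine. It does not connect to VoltDB, and the table and column names are hypothetical.

```python
import sqlite3

# SQLite is used here only as a stand-in SQL engine, not VoltDB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(32));
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         total INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25);
""")

# Explicit outer join of two tables on a specific column (ON clause).
query = """
    SELECT c.name, o.total
    FROM customer AS c LEFT OUTER JOIN orders AS o
      ON c.id = o.customer_id
    ORDER BY c.name
"""
rows = conn.execute(query).fetchall()
# Bob has no orders, so his row carries NULL (None) for o.total.
print(rows)
conn.close()
```

With an inner join, the row for the customer without orders would be dropped. The outer join keeps it and fills the missing columns with NULL.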

1.3.

Improved String Parsing in Web Studio

Web Studio allows you to invoke stored procedures on a running database, entering procedure arguments as text. Previously, Web Studio had some difficulty parsing strings containing multiple embedded quotation marks. Argument parsing has been enhanced to support arbitrarily complex strings.

1.4.

Planner Improvements

This release includes a number of improvements to the planning of SQL statements. In particular, improvements have been made to how indexes are applied under various conditions.

2. Release V3.2.1

2.1.

@SnapshotStatus Reports on all Cluster Nodes

Previously, the @SnapshotStatus system procedure only returned results for the server processing the request, not all nodes in the cluster. Starting with this release, a call to @SnapshotStatus returns status information for all nodes in the cluster.

2.2.

Obfuscating Passwords in the Deployment File

By default, when enabling security the deployment file contains definitions of users and passwords. These definitions are included as plain text. If your environment or operating procedures require the deployment file to be secured against observation, you can preprocess the file to obscure the passwords using the voltdb mask command. These processed deployment files can then be used to start the VoltDB database. The syntax for the voltdb mask command is as follows:

$ voltdb mask deployment-file [new-deployment-file]

If you specify one file name, the file is processed in place. If you specify two file names, the first is used as the input file and the second as the output file. The processed deployment file contains an additional attribute on the <user> tag, plaintext="false", indicating to the VoltDB server that the passwords have been obscured.

2.3.

Improved Compiler Annotations

When VoltDB compiles the database schema, it displays a summary of the stored procedure queries as part of the compiler output. This output includes annotations identifying the type of procedure (read vs. write and single vs. multi-partitioned), potentially slow queries due to a sequential table scan, and possible dangers introduced by non-deterministic output. Several of these annotations have been changed to improve readability. The read/write annotation has been changed to [READ] and [WRITE] and [Seq] has been changed to [TABLE SCAN].

3. Release V3.2.0.1

3.1.

Correct Live Schema Updates that Change Unique Indexes

This release fixes a critical bug where updating the schema "on the fly" could cause the database to stop and make the command logs unusable for recovery. The issue only affects application catalogs that change a unique index by either adding or removing columns from the unique index (or primary key). Because of the seriousness of this bug, all users are strongly urged to upgrade to the latest version at their earliest possible convenience.

Until you upgrade, avoid updating the database catalog while the database is running if the update changes unique indexes. Note this issue only affects live updates. You can still update the schema using a maintenance window by performing the following sequence of commands: save, shutdown, create and restore.

4. Release V3.2

4.1.

Enhanced Support for Live Schema Updates

It is now possible to make many more changes to the schema "on the fly" — while the database is running — than in previous versions. You can now add, delete, or modify columns and indexes as well as tables. The only remaining limitations are that you cannot add a new constraint to an existing column or index and you cannot modify the definition of an existing view. (However, you can add and remove views.)

With this new functionality, the ability to perform live schema changes becomes a commercial feature. In other words, starting with V3.2 the @UpdateApplicationCatalog system procedure and the voltadmin update command are available in the Enterprise Edition only.

4.2.

Improved Performance and Resilience of Catalog Updates

In addition to new capabilities in live schema updates, the updates themselves have been improved in terms of performance and resilience to node failures. Updates to large catalogs, especially those with many stored procedures, are significantly faster than in previous versions. In addition, the update process has been hardened to avoid problems if a node fails during an update.

Warning concerning previous versions: If you perform catalog updates using previous versions of VoltDB, it is recommended that you take a snapshot prior to and immediately after the update as a precaution in case of database failure. Command logs and other durability features may be in an inconsistent state following a catalog update and before a new snapshot is taken. This issue is corrected in the current release.

4.3.

New Return Status for Snapshot Restore

It is possible for a snapshot restore (using the voltadmin restore command or @SnapshotRestore system procedure) to partially succeed. For example, if the schema has changed since the snapshot was created or some snapshot files are missing, only part of the data may be restored.

Previously, this condition was reported as success with details about what succeeded and what failed in the VoltTable response. To avoid confusion with a complete and successful restore, this function has been changed to report an error, OPERATIONAL_FAILURE, if one or more tables or partitions are not restored.

4.4.

Change to the Default Heartbeat Timeout

The default heartbeat timeout has been increased from 10 seconds to 90 seconds. In production, it has been found that network and process contention issues can make VoltDB mistakenly exceed the 10-second timeout, resulting in still-functioning nodes being timed out. As a result, the default has been increased.

A consequence of this change is that it will now take longer for actual node failures to be recognized by the rest of the cluster, potentially stalling transactions until the failure is detected. You can change the timeout period using the <heartbeat> element in the deployment file or in the configuration dialog of the VoltDB Enterprise Manager. See the appendix on server configuration options in the VoltDB Management Guide for details.

4.5.

Bug Fixes

The following issues have been fixed:

  • Automated snapshots and node failure

    It was possible for automated snapshots to silently stop occurring after a node failed and rejoined the cluster. This did not happen all the time, but could not be corrected without restarting the cluster. This issue has been corrected.

  • The sqlcmd command and stored procedure names

    Previously, the sqlcmd command line tool could not invoke a stored procedure if the procedure name started with a SQL statement keyword, such as "select" or "delete". This issue has been corrected.

5. Release V3.1

5.1.

Database Replication Support

V3.1 includes production-ready support for Database Replication (DR). The changes to transaction coordination introduced in V3.0 required a significant rewrite of the DR functionality. As a result, the initial V3.0 release included a beta release of DR only. This release returns DR to full operational status. Users who stayed with earlier VoltDB releases to use DR in production are now encouraged to upgrade to V3.1.

5.2.

New CAST() Function

V3.1 introduces a new function, CAST(), that lets you convert the datatype of an expression. CAST() can be very useful when performing operations on column values or function results that are not in the appropriate datatype. For example, you can use CAST() to convert the results of the FIELD() function, which always returns a string, to a numeric datatype. See Using VoltDB for more information.

5.3.

Additional JSON Functions

V3.1 also includes additional functions for traversing JSON data. The ARRAY_LENGTH() and ARRAY_ELEMENT() functions return the number of elements in a JSON array and a specific element from that array, respectively. These functions can be combined with other functions to retrieve specific items within the JSON structure. For example, the following SQL fragment retrieves the first element of an array in the JSON field named "options" from the column JSONDATA:

SELECT ARRAY_ELEMENT(FIELD(JSONDATA,'options'),0) AS first_option

See Using VoltDB for more information on using these functions.

5.4.

Changes to Log4J Components

The Log4J messages generated by VoltDB have been reorganized. Messages previously logged under the JOIN component are now logged under REJOIN and a new logger, SNAPSHOT, has been added for logging snapshot activity.

6. Release V3.0

6.1.

New Transaction Coordination Architecture

VoltDB 3.0 includes a new transaction coordination architecture. Its benefits include lower overall latency, higher transaction throughput, and reduced dependency on NTP.

6.2.

Simplified Design and Development Process

V3.0 integrates much of the database configuration information into the schema, eliminating the need for a separate project definition file. Features that can now be defined in data definition language (DDL) include:

  • Stored procedure declarations and partitioning information

  • Table partitioning information

  • Export tables

  • Security roles

In addition, several new command line tools make the development process simpler and more consistent. New shell commands for compiling, starting, and managing VoltDB databases include voltdb, voltadmin and sqlcmd.

6.3.

Other New Features

In addition to improvements in transaction coordination and the development process, a number of new functional capabilities have been added. New features include:

  • Support for more export formats and running the export client on the database servers

  • Support for JSON as content using the SQL FIELD function

  • The ability to use functions and expressions in indexes

6.4.

Special Considerations for Existing Customers

Users of earlier versions of VoltDB should be aware of the following changes in behavior that could impact their applications:

  • You must recompile your application catalogs and upgrade databases when upgrading to V3.0.

  • Network partition detection is now enabled by default.

  • K-safety now requires exact multiples of partitions. In other words, sitesperhost * hostcount must be a whole multiple of K+1.
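
The K-safety rule in the last item can be checked before starting a cluster. The following Python sketch applies the stated condition, that sitesperhost * hostcount must be a whole multiple of K+1. The function name is illustrative and is not part of VoltDB.

```python
def valid_ksafety_config(sitesperhost, hostcount, kfactor):
    """Check that the total site count is a whole multiple of K+1."""
    total_sites = sitesperhost * hostcount
    return total_sites % (kfactor + 1) == 0


# 4 sites on each of 3 hosts gives 12 sites: valid for K=1 (12 / 2).
print(valid_ksafety_config(4, 3, 1))
# 3 sites on each of 3 hosts gives 9 sites: not valid for K=1.
print(valid_ksafety_config(3, 3, 1))
```

In the second case, either the number of sites per host or the number of hosts would need to change, or K would need to be set to 2, for the configuration to satisfy the rule.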

Known Limitations

The following are known limitations to the current release of VoltDB. Workarounds are suggested where applicable. However, it is important to note that these limitations are considered temporary and are likely to be corrected in future releases of the product.

1. SQL and Stored Procedures

1.1.

Two identical aggregates in a SELECT statement result in only one value being returned.

If your SELECT statement includes two aggregates of the same column, VoltDB will return only one column value as part of the result set. For example, if your SQL statement is SELECT COUNT(ColumnA), COUNT(ColumnA) FROM MyTable, the resulting VoltTable will have only one column value per row. The workaround is to either not request the same value twice or aggregate on different columns (for example, SELECT COUNT(Column10), COUNT(Column2)).

1.2.

Selection expressions involving arithmetic on aggregate functions are not allowed.

In SELECT statements, the selection expression can include columns, aggregate functions (such as COUNT), or arithmetic expressions involving columns (such as Col1 + Col2). However, it cannot include arithmetic involving aggregate functions (such as SELECT SUM(Price) + 2). Using aggregate functions in an arithmetic expression results in an error when you compile the application catalog.

The workaround is to use the aggregate in the SELECT statement and then perform the additional arithmetic on the result set either in the stored procedure or in the client application after the transaction completes.

1.3.

SELECT DISTINCT using multiple columns or expressions is not supported.

Use of SELECT DISTINCT is supported for a single column (such as SELECT DISTINCT Price FROM Inventory). However, using DISTINCT with multiple columns or arithmetic expressions is not currently supported. For example, the following SELECT DISTINCT statements should not be used:

SELECT DISTINCT Price, Discount FROM Inventory
SELECT DISTINCT (Price - Discount) FROM Inventory

1.4.

Do not use assertions in VoltDB stored procedures.

VoltDB currently intercepts assertions as part of its handling of stored procedures. Attempts to use assertions in stored procedures for debugging or to find programmatic errors will not work as expected.

1.5.

Views of partitioned tables must include the partitioning column in the GROUP BY clause.

When creating a view of a partitioned table, using the CREATE VIEW statement, the column on which the table is partitioned must be included in the GROUP BY clause. If not, queries of the view using SELECT ... FROM can give incorrect results. Note that VoltDB does not detect or warn you of this condition. Be sure to include the partitioning column in the GROUP BY clause when creating a view on a partitioned table.

2. Client Interfaces

2.1.

Avoid using decimal datatypes with the C++ client interface on 32-bit platforms.

There is a problem with how the math library used to build the C++ client library handles large decimal values on 32-bit operating systems. As a result, the C++ library cannot serialize and pass Decimal datatypes reliably on these systems.

Note that the C++ client interface can send and receive Decimal values properly on 64-bit platforms.

3. Runtime Issues

3.1.

Partially removing snapshot files from the database servers can cause recovery to fail.

To ensure proper recovery on startup, either from command logs or the last database snapshot, make sure all snapshot files — or at least complete subsets of the snapshot files — are available on the nodes of the cluster. If you delete or move snapshot files (for example, copying all snapshot files to a single node) be sure to keep all of the files for each node together. Do not selectively delete or move individual files or else the recovery may fail.

3.2.

Adding a column with DEFAULT NULL to the application catalog causes restore to fail.

You can modify the database schema of an existing database by recompiling the application catalog with the changes, saving the current database contents, restarting with the modified catalog and then restoring the contents. However, if the modified catalog adds a column with the default value specified as NULL, the database will fail when you attempt to restore a snapshot created using the old catalog. The workaround, for the time being, is to not specify a default value for the new column, specify a default other than NULL, or save the database contents as CSV files and use the csvloader utility to load each table.

3.3.

Snapshot Restore Cannot Change a Table from Partitioned to Replicated or Vice Versa.

You can add and remove tables on the fly using voltadmin update or the @UpdateApplicationCatalog system procedure. To modify individual tables, such as adding or removing columns, you must save a snapshot, create a new database with the modified catalog, and then restore the snapshot. However, this latter method does not currently support changing a table from partitioned to replicated, or the reverse.

The workaround for changing a partitioned table to replicated or a replicated table to partitioned is the following:

  1. Make the desired changes to the schema and compile a new application catalog.

  2. Save the database as CSV files, using voltadmin save --format=csv.

  3. Collect the resulting CSV files.

  4. Create a new database using the modified catalog and starting in admin mode.

  5. Use csvloader to load the individual table data from the CSV files.

  6. Reset the database to normal operation using voltadmin resume.

Implementation Notes

The following notes provide details concerning how certain VoltDB features operate. The behavior is not considered incorrect. However, this information can be important when using specific components of the VoltDB product.

1. SQL

1.1.

Do not use UPDATE to change the value of a partitioning column

For partitioned tables, the value of the column used to partition the table determines what partition the row belongs to. If you use UPDATE to change this value and the new value belongs in a different partition, the UPDATE request will fail and the stored procedure will be rolled back.

Updating the partition column value may or may not cause the record to be repartitioned (depending on the old and new values). However, since you cannot determine if the update will succeed or fail, you should not use UPDATE to change the value of partitioning columns.

The workaround, if you must change the value of the partitioning column, is to use both a DELETE and an INSERT statement to explicitly remove and then re-insert the desired rows.

1.2.

Certain SQL syntax errors result in the error message "user lacks privilege or object not found" when compiling the runtime catalog.

If you refer to a table or column name that does not exist, the VoltDB compiler issues the error message "user lacks privilege or object not found". This can happen, for example, if you misspell a table or column name.

Another situation where this occurs is if you mistakenly use double quotation marks to enclose a string literal (such as WHERE ColumnA="True"). ANSI SQL requires single quotes for string literals and reserves double quotes for object names. In the preceding example, VoltDB interprets "True" as an object name, cannot resolve it, and issues the "user lacks privilege" error.

The workaround, if you receive this error, is to look for misspelled table or column names, or string literals delimited by double quotes, in the offending SQL statement.
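
The search for double-quoted literals can be partly automated. The following Python sketch is a naive, regex-based check that lists double-quoted tokens in a SQL statement while ignoring anything inside proper single-quoted literals. It is not a SQL parser, the function name is hypothetical, and a reported token may be a legitimately quoted object name.

```python
import re

# Single-quoted literals (with '' as an escaped quote) are valid strings.
SINGLE_QUOTED = re.compile(r"'(?:[^']|'')*'")
DOUBLE_QUOTED = re.compile(r'"([^"]*)"')


def double_quoted_tokens(sql):
    """Return the contents of double-quoted tokens in a SQL statement."""
    # Blank out proper string literals so quotes inside them are ignored.
    without_literals = SINGLE_QUOTED.sub("''", sql)
    return DOUBLE_QUOTED.findall(without_literals)


print(double_quoted_tokens('SELECT * FROM T WHERE ColumnA="True"'))
```

Each token returned is a candidate for replacement with a single-quoted literal, such as WHERE ColumnA='True'.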

2. Runtime

2.1.

File Descriptor Limits

VoltDB opens a file descriptor for every client connection to the database. In normal operation, this use of file descriptors is transparent to the user. However, if there are an inordinate number of concurrent client connections, or clients open and close many connections in rapid succession, it is possible for VoltDB to exceed the process limit on file descriptors. When this happens, new connections may be rejected or other disk-based activities (such as snapshotting) may be disrupted.

In environments where there are likely to be an extremely large number of connections, you should consider increasing the operating system's per-process limit on file descriptors.
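
To judge whether the limit needs raising, you can compare the expected number of client connections with the current per-process limit. The following Python sketch reads the limit through the standard resource module (Unix only) and computes the remaining headroom. The function name and the reserved allowance of 100 descriptors are assumptions for illustration, not VoltDB values.

```python
import resource


def descriptor_headroom(expected_connections, soft_limit, reserved=100):
    """Return how many descriptors remain after the expected connections.

    `reserved` is a hypothetical allowance for files and sockets that
    are not client connections (for example, snapshot files).
    """
    return soft_limit - reserved - expected_connections


# Per-process limits for the current process (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
```

A negative result from descriptor_headroom suggests that the soft limit should be raised before the database is started under that load.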

3. Logging

3.1.

All logging messages reported by the VoltDB server are timestamped using GMT (Greenwich Mean Time).

This is not a problem when looking at VoltDB logs separately. However, you should be aware of this distinction when integrating logging of VoltDB with logging of other system components that use the local time zone (rather than GMT). You may want to convert one of the log streams so the time zones match.
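
One way to align the streams is to convert the GMT timestamps to the other component's time zone. The following Python sketch shows the conversion for a single timestamp. The timestamp layout is an assumption and must be adjusted to match the pattern configured for your logs. The function name and the fixed UTC-05:00 offset are chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout; adjust to match your logging pattern.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def gmt_to_zone(stamp, zone):
    """Parse a GMT log timestamp and convert it to the given time zone."""
    parsed = datetime.strptime(stamp, LOG_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.astimezone(zone)


# A fixed UTC-05:00 offset stands in for the local time zone.
eastern = timezone(timedelta(hours=-5))
local = gmt_to_zone("2013-05-22 14:30:00,123", eastern)
print(local.isoformat())
```

A fixed offset ignores daylight saving time. For production use, a named time zone from the operating system's time zone database is a better fit.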

3.2.

To simplify logging, a file has been added to the distribution listing all of the VoltDB logging categories.

The file voltdb/log4j.xml lists all of the VoltDB-specific logging categories. It also serves as a useful logging schema. The sample applications and the VoltDB shell commands use this file to configure logging and it is recommended for new application development.