What is VoltDB?
What are good use cases for VoltDB?
How does VoltDB achieve ACID compliance?
What server platforms does VoltDB support?
What SQL does VoltDB support?
What is the architecture of VoltDB?
How does VoltDB scale?
How does VoltDB partition a database?
How are VoltDB partitions different from sharding a traditional database?
Which programming languages can be used to build applications that access VoltDB?
How do you increase the size of a VoltDB database cluster?
What is the maximum cluster size that VoltDB supports?
Does VoltDB come with management tools?
What database monitoring tools are available?
Does VoltDB compress in-memory data?
How does VoltDB Disaster Recovery work?
How does VoltDB ACID durability work?
How does VoltDB differ from MySQL used with memcached?
How does VoltDB differ from Key-Value stores?

What is VoltDB?

VoltDB is a relational SQL database for applications that require real-time decisions and real-time analytics even while handling fast incoming data feeds. VoltDB offers:

  • Orders of magnitude better performance than conventional DBMSs
  • Linear scaling on commodity 64-bit Linux servers
  • SQL as the DBMS interface
  • ACID transactions to ensure data consistency and accuracy
  • Built-in high availability
  • Built-in database crash recovery
  • Database replication for disaster recovery, hot stand-by and workload optimization
  • Prepackaged developer tools, Key-Value and memcached reference interfaces
  • Consoles for database provisioning, monitoring and management

What are good use cases for VoltDB?

VoltDB is used today for traditional high performance applications such as capital markets data feeds, financial trade, telco record streams and sensor-based network systems. It’s also used in emerging applications like wireless, online gaming, fraud detection, digital ad exchanges and micro transaction systems. Any application requiring high database throughput, linear scaling and uncompromising data accuracy will benefit from VoltDB.

How does VoltDB achieve ACID compliance?

ACID stands for Atomicity, Consistency, Isolation, and Durability — the cornerstones of database transaction processing. VoltDB is a strict ACID system.

  • Atomicity: VoltDB defines a transaction as an auto-commit batch of ad hoc SQL, as a DDL-defined stored procedure or, for complex business logic, as a Java stored procedure. Transactions are either fully applied or fully rolled back.
  • Consistency: VoltDB enforces schema and datatype constraints in all database queries, including primary key and unique index constraints. VoltDB does not currently support foreign keys or triggers.
  • Isolation: VoltDB transactions are serializably ordered and run to completion on all affected partitions without interleaving, producing linearizable inter-transaction isolation.
  • Durability: VoltDB provides active/active intra-cluster replication of partitions in-memory (referred to as K-safety) and periodic database snapshots to disk combined with command logging to disk to ensure high availability and database durability. Additionally, VoltDB’s Database Replication feature provides durability and business continuity across data centers for enterprise-class disaster recovery strategies.
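
The all-or-nothing behavior described under Atomicity can be illustrated with a small sketch. This is not VoltDB code and does not show how VoltDB implements rollback; it is a toy model in Python, and the names run_transaction, transfer and InsufficientFunds are hypothetical, chosen only for this illustration.

```python
import copy


class InsufficientFunds(Exception):
    """Raised to abort a transfer (hypothetical name for this sketch)."""


def run_transaction(state, procedure, *args):
    """Run `procedure` against a working copy of `state`.

    The working copy replaces the live state only if the procedure
    finishes without raising, so a failure leaves no partial changes.
    """
    working = copy.deepcopy(state)
    try:
        procedure(working, *args)
    except Exception:
        return state, False   # rolled back: original state is returned untouched
    return working, True      # committed: all changes become visible together


def transfer(accounts, src, dst, amount):
    """Toy stored procedure: move `amount` from src to dst."""
    accounts[src] -= amount
    if accounts[src] < 0:
        raise InsufficientFunds(src)   # abort after a partial change was made
    accounts[dst] += amount


accounts = {"alice": 100, "bob": 20}
accounts, ok1 = run_transaction(accounts, transfer, "alice", "bob", 30)
accounts, ok2 = run_transaction(accounts, transfer, "bob", "alice", 500)
```

The second transfer debits the working copy before it fails, yet the returned state still shows the balances left by the first transfer, which is the "fully applied or fully rolled back" guarantee in miniature.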

What server platforms does VoltDB support?

VoltDB requires a 64-bit Linux server for production use. Basic minimum requirements are:

  • 1 or 2 socket servers with multi-core processors (typically 2U, 1U or half-width can be used)
  • CPU Architecture: 64-bit x86_64 processor
  • CPU cores: Dual core or greater 1.6 GHz or greater
  • Memory (DRAM): 4GB or greater (typically higher capacities)
  • Network Cards: Recommend each server has 2 NICs or vNICs, (internal and external interfaces). For most use cases 1GbE is sufficient. 10GbE is recommended for large record sizes
  • Storage: Recommend disk controller with BBWC (battery-backed write cache). Local SATA or SAS 10K disks, dedicated disk for command log
  • OS Version and Release: RHEL 5.8+ and 6.3+, CentOS 5.8+ and 6.3+, Ubuntu 10.04+ and 12.04+, OS X 10.6+ (development only)

What SQL does VoltDB support?

VoltDB supports a subset of ANSI-standard SQL 99, including the CREATE INDEX, CREATE TABLE, and CREATE VIEW statements for schema definition and SELECT, INSERT, UPDATE, and DELETE for data manipulation. Additional SQL syntax will be added over time as the needs of users and customers dictate.

See the Using VoltDB manual for details on the specific SQL syntax that the current version of VoltDB supports.
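
As a rough illustration of the statement types listed above, the following sketch runs each of them once. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a VoltDB cluster; SQLite's dialect is not VoltDB's, the table, index and view names are hypothetical, and the Using VoltDB manual remains the authority on the exact syntax VoltDB accepts.

```python
import sqlite3

# SQLite in-memory database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Schema definition: CREATE TABLE, CREATE INDEX, CREATE VIEW.
cur.execute(
    "CREATE TABLE votes ("
    " phone_number BIGINT NOT NULL,"
    " state VARCHAR(2) NOT NULL,"
    " contestant INTEGER NOT NULL)"
)
cur.execute("CREATE INDEX votes_by_phone ON votes (phone_number)")
cur.execute(
    "CREATE VIEW votes_per_state AS"
    " SELECT state, COUNT(*) AS total FROM votes GROUP BY state"
)

# Data manipulation: INSERT, UPDATE, DELETE, SELECT.
cur.executemany(
    "INSERT INTO votes VALUES (?, ?, ?)",
    [(6175550101, "MA", 1), (6175550102, "MA", 2), (2125550103, "NY", 1)],
)
cur.execute("UPDATE votes SET contestant = 3 WHERE phone_number = ?", (6175550102,))
cur.execute("DELETE FROM votes WHERE state = ?", ("NY",))
rows = cur.execute(
    "SELECT state, total FROM votes_per_state ORDER BY state"
).fetchall()
conn.close()
```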

What is the architecture of VoltDB?

The VoltDB Technical Whitepaper describes the main concepts behind VoltDB’s modern architecture.

How does VoltDB scale?

VoltDB automatically partitions frequently accessed database tables across the available cluster nodes. Both the capacity and performance of the database can be increased by adding nodes to the cluster. Upon changes to cluster size, VoltDB automatically redistributes the partitions to the new configuration when you reload the data. VoltDB also allows tables with infrequently-changing data to be replicated to each node to further optimize performance.

How does VoltDB partition a database?

For partitioned tables, VoltDB distributes the rows across the partitions using a hash scheme. The user identifies, for each partitioned table, which column is used as the input to the internal hashing function.
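
The idea of hashing one column to choose a partition can be sketched as follows. The hash function here (SHA-256 reduced modulo the partition count) is an assumption made for illustration and is not VoltDB's internal hashing function; partition_for, distribute and the customer_id column are hypothetical names.

```python
import hashlib


def partition_for(value, partition_count):
    """Map a partitioning-column value to a partition number.

    Illustrative hash only; not the function VoltDB uses internally.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(rows, column, partition_count):
    """Place each row in the partition chosen by hashing `column`."""
    partitions = [[] for _ in range(partition_count)]
    for row in rows:
        partitions[partition_for(row[column], partition_count)].append(row)
    return partitions


rows = [{"customer_id": i, "total": 10 * i} for i in range(1000)]
partitions = distribute(rows, "customer_id", 4)
sizes = [len(p) for p in partitions]
```

Because the partition depends only on the column value, every row with the same customer_id lands in the same partition, and a lookup by that column needs to visit only one partition.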

How are VoltDB partitions different from sharding a traditional database?

In sharding, database partitions are actually separate, unrelated database instances. It is the responsibility of the application code to know which shard contains specific data as well as to manage the complexities of any queries that require data from multiple shards. More importantly, there is no guarantee of data or transactional consistency within the database system. All consistency logic must be written into the application. With VoltDB, the database engine transparently provides partition management, cross-partition data access and full ACID compliance across the entire database and all partitions.

Another cost of sharding is the complexity of managing the individual database instances. Backup, recovery, and all other management tasks must be performed separately for every node. With VoltDB, these management operations are managed centrally and transparently.

Which programming languages can be used to build applications that access VoltDB?

VoltDB provides client libraries for Java, C++, C#, PHP, Python and Node.js. The VoltDB community has also developed client libraries for Erlang, Go and Ruby.

How do you increase the size of a VoltDB database cluster?

The size of the database cluster is defined when you compile the application catalog and start the database. To increase the cluster size, perform the following steps in order:

  • Save the current database contents to disk (using the SnapshotSave system procedure).
  • Edit the deployment file, specifying the increased number of cluster nodes (in the hostcount attribute).
  • Restart the database cluster, using the new deployment file.
  • Reload the data from disk (using the SnapshotRestore system procedure).
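
Why the data is reloaded after the cluster grows can be seen in a toy model: the partition a row belongs to depends on the number of partitions, so a larger cluster spreads the same rows differently. The sketch below is not VoltDB code. The functions load and save are hypothetical stand-ins for restoring and saving a snapshot, the CRC32-based hash is an assumption, and the model assumes two partitions per host with no K-safety.

```python
import zlib

SITES_PER_HOST = 2  # assumed value for this sketch


def partition_for(key, partition_count):
    """Illustrative hash; not VoltDB's internal function."""
    return zlib.crc32(str(key).encode("utf-8")) % partition_count


def load(rows, host_count):
    """Toy stand-in for a restore: place rows for the given cluster size."""
    partition_count = host_count * SITES_PER_HOST
    cluster = {p: [] for p in range(partition_count)}
    for key, value in rows:
        cluster[partition_for(key, partition_count)].append((key, value))
    return cluster


def save(cluster):
    """Toy stand-in for a snapshot: collect every row from every partition."""
    return sorted(row for part in cluster.values() for row in part)


original = [(k, "row-%d" % k) for k in range(200)]
small = load(original, host_count=2)   # cluster before the resize
saved = save(small)                    # save the contents
large = load(saved, host_count=3)      # restart with a larger host count, then reload
```

No rows are lost or duplicated by the reload; they are only redistributed across the larger number of partitions.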

What is the maximum cluster size that VoltDB supports?

There is no architectural limit to the number of nodes in a VoltDB cluster. That said, people often think of performance as a comparative value proportional to cluster size. And although it is true that VoltDB’s throughput scales linearly, it is also true that VoltDB’s initial performance on a single node is 50 to 100 times greater than comparable database products. As a consequence, it is possible to achieve throughput rates of over a million transactions per second on a cluster with as few as 12 nodes.

VoltDB is regularly tested on and tuned for clusters of 6 to 12 nodes and has scaled linearly to 3.4 million TPS on 30 nodes. If you are considering running a large VoltDB cluster, please contact us – we’d love to help users design high-scaling VoltDB infrastructures.
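
The figures above can be cross-checked with simple arithmetic. The sketch below assumes perfectly linear scaling and that the 30-node result applies to other cluster sizes and workloads, which is a simplification; real throughput depends on the application.

```python
# Figures quoted above: 3.4 million TPS measured on 30 nodes.
MEASURED_TPS = 3_400_000
MEASURED_NODES = 30


def projected_tps(nodes):
    """Throughput for `nodes` nodes under the linear-scaling assumption."""
    return MEASURED_TPS * nodes // MEASURED_NODES


def nodes_needed(target_tps):
    """Smallest node count whose projected throughput reaches the target."""
    return -(-target_tps * MEASURED_NODES // MEASURED_TPS)  # ceiling division


per_node_tps = projected_tps(1)              # roughly 113 thousand TPS per node
twelve_node_tps = projected_tps(12)          # 1,360,000 TPS
nodes_for_million = nodes_needed(1_000_000)  # 9 nodes
```

Under these assumptions a 12-node cluster projects to about 1.36 million TPS, which is consistent with the claim of over a million transactions per second on as few as 12 nodes.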

Does VoltDB come with management tools?

The VoltDB Enterprise Edition includes a browser-based management tool called the VoltDB Enterprise Manager. The VoltDB Enterprise Manager helps you deploy and control a VoltDB database in a cluster environment. You can start and stop the database, update the schema and stored procedures, and manage disk-based snapshots of the data from a single console interface. See the VoltDB Management Guide for details.

What database monitoring tools are available?

The VoltDB Enterprise Manager provides performance and activity monitoring, in addition to its management and control functionality. The browser-based console interface provides real-time statistics on the number of records in each partition, as well as measurements of throughput and latency. For those using the Ganglia monitoring tool, the VoltDB Enterprise Manager also exports performance data to Ganglia automatically.

For those using the VoltDB Community Edition, there are system procedures that can provide similar information through the callable interface, such as @Statistics and @SystemInformation.

Does VoltDB compress in-memory data?

VoltDB does not compress data stored in memory. It can store externally-compressed data in binary data type fields. Snapshots to disk and data streamed for cluster replication are compressed.

How does VoltDB Disaster Recovery work?

VoltDB has a feature called Database Replication which enables a VoltDB cluster to be
replicated to a replica cluster over the network (such as a WAN) for additional redundancy
such as for disaster recovery. VoltDB database replication uses an agent program (DR Agent)
to copy a compressed snapshot of the entire data set from the master cluster to the replica,
and then to stream the subsequent write commands continuously. The replica cluster can
execute separate read-only transactions from client connections. In the event of a loss of
availability of the master cluster, the replica cluster can be promoted to become the new
master cluster and applications using the database can fail over to this new master. This is a
simple command that can be automated using scripts or management tools, or can be
performed as a manual process to support different operations policies.
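
The sequence described above (compressed snapshot first, then a continuous stream of writes, then promotion) can be modeled in a few lines. This is a toy sketch and not the DR Agent itself: the Cluster and Agent classes, their methods, and the use of JSON with zlib compression are all assumptions made for illustration.

```python
import json
import zlib


class Cluster:
    """Toy cluster holding key/value data in memory."""

    def __init__(self, role):
        self.role = role      # "master" or "replica"
        self.data = {}
        self.outbox = []      # writes not yet shipped to the replica

    def write(self, key, value):
        if self.role != "master":
            raise PermissionError("replica accepts read-only transactions")
        self.data[key] = value
        self.outbox.append((key, value))

    def read(self, key):
        return self.data.get(key)

    def snapshot(self):
        """Compressed copy of the entire data set."""
        return zlib.compress(json.dumps(self.data).encode("utf-8"))

    def promote(self):
        self.role = "master"


class Agent:
    """Toy agent: copies a snapshot once, then streams later writes."""

    def __init__(self, master, replica):
        self.master, self.replica = master, replica
        raw = zlib.decompress(master.snapshot()).decode("utf-8")
        replica.data = json.loads(raw)
        master.outbox.clear()  # the snapshot already covers earlier writes

    def stream(self):
        for key, value in self.master.outbox:
            self.replica.data[key] = value
        self.master.outbox.clear()


master, replica = Cluster("master"), Cluster("replica")
master.write("a", 1)
agent = Agent(master, replica)   # initial compressed snapshot
master.write("b", 2)
agent.stream()                   # subsequent write is streamed

try:
    replica.write("c", 3)
    rejected = False
except PermissionError:
    rejected = True              # replica is read-only until promoted

replica.promote()                # master lost: promote the replica
replica.write("c", 3)
```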

How does VoltDB ACID durability work?

VoltDB writes each incoming request to a command log on disk, capturing the very latest
incoming transactions. The commands are collected and written in small, high-frequency
batches, and the fsync write confirmation is synchronized with the response to ensure full
durability. When the command log reaches a configured size, a snapshot is taken so that the
command log will not grow indefinitely. A snapshot is a point-in-time consistent copy of the
entire data in memory, which is written in a compressed binary format.

Command logging runs best when there is a fast fsync, so a dedicated disk and disk controller
with BBWC is recommended for the synchronous mode, and when this hardware is
employed the command log will not significantly affect throughput or latency. An
asynchronous mode is optional.

When recovering from disk, VoltDB loads the latest snapshot directly into memory and then
plays back the command log to reach the latest state.

This has a number of performance advantages over the disk operations that traditional
database architectures use. VoltDB can write the commands as soon as they arrive, without
waiting for the transaction to be executed. There is no need to record the before and after
states, only the inputs. The before state is maintained in memory for one transaction at a
time during transaction execution. All the disk operations are sequential.
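
The interplay of command log, snapshot and recovery can be sketched with a toy in-memory store. This is not VoltDB's implementation: ToyStore and its methods are hypothetical, plain Python lists and strings stand in for the files on disk, and the log is truncated by entry count rather than by size.

```python
import json


class ToyStore:
    """In-memory store with a command log and periodic snapshots."""

    def __init__(self, max_log_entries=3):
        self.data = {}
        self.snapshot = "{}"    # stands in for the snapshot on disk
        self.command_log = []   # stands in for the command log on disk
        self.max_log_entries = max_log_entries

    def execute(self, command):
        # Log the input before executing it; only inputs are recorded,
        # not before and after states.
        self.command_log.append(json.dumps(command))
        self._apply(command)
        if len(self.command_log) >= self.max_log_entries:
            self.take_snapshot()

    def _apply(self, command):
        op, key, amount = command
        if op == "add":
            self.data[key] = self.data.get(key, 0) + amount
        elif op == "delete":
            self.data.pop(key, None)

    def take_snapshot(self):
        # Point-in-time copy of all data; the log can then be truncated.
        self.snapshot = json.dumps(self.data)
        self.command_log.clear()

    @classmethod
    def recover(cls, snapshot, command_log, max_log_entries=3):
        """Load the latest snapshot, then replay the logged commands."""
        store = cls(max_log_entries)
        store.data = json.loads(snapshot)
        store.snapshot = snapshot
        for entry in command_log:
            store._apply(json.loads(entry))
        store.command_log = list(command_log)
        return store


store = ToyStore(max_log_entries=3)
commands = [("add", "x", 5), ("add", "y", 2), ("add", "x", 1),
            ("delete", "y", 0), ("add", "z", 7)]
for cmd in commands:
    store.execute(cmd)

# Simulate a crash: only the snapshot and the command log survive.
recovered = ToyStore.recover(store.snapshot, store.command_log)
```

Replaying inputs reproduces the same state only because each command is deterministic, which is why recording the inputs alone is sufficient in this model.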

How does VoltDB differ from MySQL used with memcached?

Memcached is a distributed in-memory cache. It provides none of the reliability or consistency of an ACID-compliant SQL database. Memcached is often used as a cache in front of MySQL to improve performance of read operations. This requires the client application to manage the hash algorithms for both memcached and MySQL, as well as to handle the chores of cache synchronization.

VoltDB automates all of these functions with none of the penalties, while providing wildly superior performance. In addition, caching can help improve read performance for products such as MySQL, but does not help scale write performance. VoltDB scales linearly for both read and write operations.

VoltDB includes a reference implementation, called VoltCache, that provides equivalent functionality to memcached. Developers familiar with memcached often use VoltCache for POCs and as a starting point for application development.

How does VoltDB differ from Key-Value stores?

Key-Value stores are a mechanism for storing arbitrary data (i.e. values) based on individual keys. Distributing Key-Value stores is simple, since there is only one key. However, there is no structure within the data store and no transactional reliability provided by the system.

VoltDB provides the ability to store either structured or unstructured data, while at the same time providing full transactional consistency and reliability. VoltDB can even define a transaction that includes reads and writes across multiple keys. Finally, VoltDB provides comparable or better performance in terms of throughput.