Technical Overview

VoltDB is a ground-up redesign of the relational database for today’s growing real-time analytics and data challenges. Architected by Dr. Mike Stonebraker, VoltDB has its roots in the groundbreaking work of database researchers at MIT, Brown and Yale.

VoltDB is a modern NewSQL in-memory database that scales to handle fast data, is powerful enough to make that data smart, and is fault tolerant in both back-room and cloud environments. The VoltDB database achieves its performance, scaling, and high availability through the following architectural elements:

  • Automatic partitioning (sharding) across a shared-nothing server cluster
  • Memory-centric design
  • Automatic replication and disk persistence for high availability
  • Data interaction – Relational SQL, JSON data type, JDBC, Ad hoc and stored procedure interfaces
  • Integrated Export System for connection to analytic systems
  • ACID consistency

Automatic Partitioning Across a Shared-Nothing Cluster

VoltDB uses a shared-nothing architecture to achieve database parallelism. Data and the processing associated with it (in the form of a single-threaded VoltDB execution “engine”) are distributed among all the CPU cores within the servers that comprise a VoltDB cluster. By extending its shared-nothing foundation to the per-core level (“virtual nodes”), VoltDB is able to scale well on commodity hardware, private or public clouds as well as higher core-density CPUs.
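The per-core partitioning described above can be sketched as follows. This is a minimal illustration, not VoltDB's actual routing logic: the CRC32-modulo scheme and the names `partition_for`, `build_cluster`, and `insert_row` are assumptions made for the example.

```python
import zlib


def partition_for(key: str, partition_count: int) -> int:
    """Map a partitioning-column value to a partition index (hypothetical scheme)."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % partition_count


def build_cluster(servers: int, cores_per_server: int) -> list:
    """Create one independent partition (a dict of rows) per core in the cluster."""
    return [dict() for _ in range(servers * cores_per_server)]


def insert_row(cluster: list, key: str, row: dict) -> int:
    """Store the row in the single partition that owns its key; return that index."""
    index = partition_for(key, len(cluster))
    cluster[index][key] = row
    return index


# Three servers with four cores each give twelve shared-nothing partitions.
cluster = build_cluster(servers=3, cores_per_server=4)
for customer_id in ("c-1001", "c-1002", "c-1003", "c-1004"):
    insert_row(cluster, customer_id, {"id": customer_id, "balance": 0})
```

Because every key maps to exactly one partition, a request that touches a single key needs no coordination with any other partition.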


In-Memory Architecture

Speed is achieved partly by using memory as the storage location for all active data. But running in-memory alone yields no more than a modest performance improvement. Because VoltDB was designed to run in-memory, its architecture eliminates multi-threading and locking overhead, a substantial cause of poor database performance. Each partition stores its data in main memory, and requests for that data are processed by the partition’s own single-threaded execution engine. With disk waits eliminated, single-threaded execution is able to achieve extraordinary performance improvements.
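A minimal sketch of the single-threaded idea, under stated assumptions: the `PartitionEngine` class and the `deposit` procedure are hypothetical names, and a plain in-process queue stands in for the engine's request handling. The point it shows is that when one thread owns the data and runs requests one at a time, no locks are needed.

```python
from collections import deque


class PartitionEngine:
    """Hypothetical single-threaded engine that owns one partition's in-memory rows."""

    def __init__(self):
        self.rows = {}        # the partition's data, held in main memory
        self.queue = deque()  # requests waiting to run

    def submit(self, procedure, *args):
        """Queue a request; nothing runs until the engine picks it up."""
        self.queue.append((procedure, args))

    def run_pending(self):
        """Run queued requests strictly one after another, each to completion."""
        results = []
        while self.queue:
            procedure, args = self.queue.popleft()
            # No lock is taken: only this loop ever touches self.rows.
            results.append(procedure(self.rows, *args))
        return results


def deposit(rows, account, amount):
    """Example request: add to a balance and return the new value."""
    rows[account] = rows.get(account, 0) + amount
    return rows[account]


engine = PartitionEngine()
engine.submit(deposit, "acct-1", 50)
engine.submit(deposit, "acct-1", 25)
balances = engine.run_pending()
```

Serial execution only pays off when each request finishes quickly, which is why keeping the data in memory and avoiding disk waits matters to this design.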


Replication and Disk Persistence

VoltDB clusters automatically replicate all work across servers within the cluster. All servers act as equal participants in the cluster while supporting a user-selectable level of redundancy through transactional replication. If a server fails, the cluster continues operating uninterrupted. All servers within the cluster are also capable of maintaining a disk-persisted copy of data. This is accomplished through periodic snapshots and through command logging, in which transactional instructions are persisted to disk to prevent transaction loss in the event of power failure.
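The interplay of snapshots and command logging can be sketched roughly as below. This is an illustration under assumptions, not VoltDB's implementation: `DurablePartition` is a hypothetical class, and in-memory Python objects stand in for the files that would live on disk.

```python
import copy


class DurablePartition:
    """Hypothetical partition that can rebuild its rows from a snapshot plus a log."""

    def __init__(self):
        self.rows = {}          # live in-memory data
        self.snapshot = {}      # stand-in for the last snapshot written to disk
        self.command_log = []   # stand-in for instructions logged since that snapshot

    def execute(self, key, delta):
        """Log the instruction first, then apply it to the in-memory rows."""
        self.command_log.append((key, delta))
        self.rows[key] = self.rows.get(key, 0) + delta

    def take_snapshot(self):
        """Persist the current state; earlier log entries are no longer needed."""
        self.snapshot = copy.deepcopy(self.rows)
        self.command_log.clear()

    def recover(self):
        """Rebuild state after a failure: load the snapshot, then replay the log."""
        rows = copy.deepcopy(self.snapshot)
        for key, delta in self.command_log:
            rows[key] = rows.get(key, 0) + delta
        return rows


partition = DurablePartition()
partition.execute("a", 10)
partition.take_snapshot()
partition.execute("a", 5)
partition.execute("b", 1)
recovered = partition.recover()
```

The snapshot bounds how much of the log has to be replayed, while the log covers the transactions that arrived after the snapshot was taken.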


Data Interaction

SQL data management is VoltDB’s access model. Data tables are partitioned according to their use and the VoltDB planner manages all SQL interactions with the individual servers of the cluster. VoltDB also supports JSON data types, which adds agility to the application development process.

VoltDB supports several client access methods including stored procedures, JDBC and ad hoc queries. Stored procedures provide the fastest processing of data because the queries are moved to the data for processing. Client libraries in a variety of popular programming languages are available for writing VoltDB applications with stored procedures.
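One way to see why moving the queries to the data helps is to count client-to-server round trips. The sketch below is a simulation under assumptions, not the VoltDB client API: the `Server` class, its `ad_hoc` and `call_procedure` methods, and the `transfer` procedure are hypothetical, and a counter stands in for network traffic.

```python
class Server:
    """Hypothetical server that counts how many requests it receives."""

    def __init__(self):
        self.rows = {"acct-1": 100, "acct-2": 0}
        self.round_trips = 0

    def ad_hoc(self, operation, *args):
        """Each ad hoc statement is its own request from the client."""
        self.round_trips += 1
        return operation(self.rows, *args)

    def call_procedure(self, procedure, *args):
        """A stored procedure is one request; its statements run next to the data."""
        self.round_trips += 1
        return procedure(self.rows, *args)


def read(rows, key):
    return rows[key]


def write(rows, key, value):
    rows[key] = value


def transfer(rows, source, target, amount):
    """All reads and writes of the transfer happen in one server-side call."""
    rows[source] -= amount
    rows[target] += amount
    return rows[source], rows[target]


# Ad hoc style: the client drives every statement itself.
ad_hoc_server = Server()
source_balance = ad_hoc_server.ad_hoc(read, "acct-1")
target_balance = ad_hoc_server.ad_hoc(read, "acct-2")
ad_hoc_server.ad_hoc(write, "acct-1", source_balance - 30)
ad_hoc_server.ad_hoc(write, "acct-2", target_balance + 30)

# Stored procedure style: the same work in a single request.
procedure_server = Server()
procedure_server.call_procedure(transfer, "acct-1", "acct-2", 30)
```

Both servers end with the same balances, but the ad hoc version needs four requests where the procedure needs one.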


Export System

Built into the VoltDB architecture is a sophisticated export system that supports the two-system architecture that many data application stacks require. One system provides the operational data capabilities (interactions, transactions, real-time analytics) while a second system provides deep analytics (reporting, complex analysis, large storage capacities). See diagram:

The VoltDB export system is a loose coupling managed from within the VoltDB application. Data is continuously and transactionally moved from the VoltDB system to a data warehouse or Hadoop system of choice. The application has complete control over when and which data moves to the external system.

[Diagram: VoltDB export system]
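The transactional nature of export described above can be sketched as follows. This is a simplified illustration under assumptions: `ExportingStore`, its `transact` and `flush` methods, and the list used as the external system are all hypothetical, and the rejection of negative values is only there to force a failed transaction.

```python
class ExportingStore:
    """Hypothetical store that queues export rows only for committed transactions."""

    def __init__(self, sink):
        self.rows = {}
        self.pending_export = []  # rows committed but not yet sent downstream
        self.sink = sink          # stand-in for a data warehouse or Hadoop system

    def transact(self, writes, export_rows):
        """Apply the writes and queue the export rows together, or do neither."""
        staged = dict(self.rows)
        for key, value in writes.items():
            if value < 0:
                # Raising before any state is replaced rolls the transaction back.
                raise ValueError("negative value rejected")
            staged[key] = value
        self.rows = staged
        self.pending_export.extend(export_rows)

    def flush(self):
        """Move queued rows to the external system; return how many were sent."""
        sent = len(self.pending_export)
        self.sink.extend(self.pending_export)
        self.pending_export = []
        return sent


warehouse = []
store = ExportingStore(warehouse)
store.transact({"order-1": 20}, [("order-1", 20)])
try:
    store.transact({"order-2": -5}, [("order-2", -5)])
except ValueError:
    pass  # the failed transaction left neither a row nor an export record
flushed = store.flush()
```

The application decides which rows to pass as `export_rows`, which mirrors the control over what data moves to the external system.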