When we tell people how fast VoltDB is - millions of multi-statement ACID transactions per second, with full disk persistence - they often ask us, "What's the catch?"
The purpose of this overview is to explain how VoltDB achieves that performance. What are the innovations? Where are the tradeoffs?
Clustering & Scalability
VoltDB is a shared-nothing, cluster-native relational database. There are several good reasons to build systems this way.
First, clusters imply redundancy and robustness. In a cluster of independent processes on independent hardware, a partial failure doesn't have to halt processing or lose data. Furthermore, administrators can specify how redundancy is balanced against hardware cost and performance.
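The balance between redundancy and hardware cost can be made concrete with a little arithmetic. The sketch below is only an illustrative model, not VoltDB's configuration interface: it assumes every partition is stored in the same number of copies, each on a different node, and the function and parameter names are hypothetical.

```python
def unique_partitions(nodes, partitions_per_node, copies):
    """Distinct data partitions a cluster can hold under this simplified model.

    Assumption: each partition is kept in `copies` identical copies on
    different nodes, so every extra copy is paid for in hardware.
    """
    if not 1 <= copies <= nodes:
        raise ValueError("copies must be between 1 and the number of nodes")
    total_slots = nodes * partitions_per_node
    if total_slots % copies:
        raise ValueError("partition slots must divide evenly among copies")
    return total_slots // copies


def tolerated_node_failures(copies):
    """Node failures that are always survivable without losing a partition."""
    return copies - 1


# Same hardware (6 nodes, 8 partition slots each), three redundancy levels.
for copies in (1, 2, 3):
    print(copies, unique_partitions(6, 8, copies), tolerated_node_failures(copies))
```

In this model, going from one copy to two on the same hardware halves the number of distinct partitions available for work, in exchange for surviving the loss of any single node.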
Much has been written about the exponential growth of data coming from an increasing number of sources - mobile devices, web, M2M and more. In the 1970s, when electronic data was first exploding, the systems available to manage that data were woefully inadequate.
VoltDB is distributed data infrastructure software. It is a core component of many fast data pipelines: ingesting data from myriad sources, performing streaming analytics on incoming data, and managing large volumes of transactions on live data in real time.
What are the jobs of your operational database? Why do you use one? What pain does it mitigate?
At a minimum, VoltDB believes an operational database should be able to:
- Reliably store and protect your data.
- Manage concurrent read and write access to that data.
- Make it easier to build really powerful data-based applications.
VoltDB is designed to be the fastest operational database available, and it's also designed to be the safest. While other systems focus on either replication or disk persistence for data safety, VoltDB supports both approaches to protecting your data. Durability and high-availability options include combining disk persistence with replication, and tuning safety settings to trade latency against resource cost. In addition to intra-datacenter replication, VoltDB offers Active/Active and Active/Passive datacenter replication. Active/Active replication enables multiple database instances, including geographically distributed instances, to support the same application. Whichever approach you pick, VoltDB offers the lowest latency and lowest cost per operation available.
VoltDB was designed as a specialized system for operations and real-time analytics. By avoiding the tradeoffs that come with general-purpose RDBMSs, VoltDB can perform many times faster while offering the strongest consistency guarantees.
VoltDB is a newer system, explicitly designed to work with other tools as part of comprehensive solutions, so ease of administration, monitoring, and integration is fundamental to its design.
VoltDB's customers frequently deploy the database in public clouds, which are first-class supported platforms. While Amazon Web Services is the most popular among our users, we have customers using Azure, SoftLayer and others. We also have customers who deploy to cloud-style private platforms.
Stored procedures lower the cost of performing complex operations by making it possible to process a sequence of SQL statements in a single transaction. By bringing processing to the data, stored procedures make it possible for VoltDB to be very fast.
Stored procedures have many benefits, but aren't always well understood. Therefore, VoltDB doesn't require users to use stored procedures. There are many ways to send SQL directly to VoltDB; some of our customers never use a single stored procedure. Each individual SQL statement still runs as a fully ACID transaction.
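To illustrate what "a sequence of SQL statements in a single transaction" buys you, here is a minimal sketch. It deliberately uses Python's built-in sqlite3 module as a stand-in, not VoltDB or its stored procedure API, and the accounts table and the transfer function are hypothetical names chosen for the example. The point is only the behavior: either every statement in the procedure takes effect, or none does.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Procedure-style logic: several statements, one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # undoes the debit above
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {account id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


conn = make_db()
print(transfer(conn, 1, 2, 30), balances(conn))   # transfer succeeds
print(transfer(conn, 1, 2, 500), balances(conn))  # rejected, balances unchanged
```

The second call debits the source account, detects the overdraft, and aborts, so the partial debit never becomes visible. Keeping the check and both updates inside one unit of work that runs next to the data is the pattern the stored procedure discussion above describes.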