We’ve just released VoltDB v5.0!
With monthly releases of VoltDB, you might think that a major numeric release would make for small news. But the changes we’re announcing in v5.0 go beyond incremental improvements to our core technology – VoltDB v5.0 makes building Fast Data applications easy.
So what, exactly, did we do?
First off, we eliminated the VoltDB catalog. This means you can now start an empty VoltDB database, connect to it and create your database schema dynamically. You no longer need to define your database and stored procedures ahead of time. In short, building a high performance database with VoltDB v5.0 is much more similar to building a traditional database.
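As a rough illustration of that workflow (start empty, connect, then define and evolve the schema over the live connection), here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in, not VoltDB or its client API; the `votes` table and its columns are hypothetical.

```python
import sqlite3

# Stand-in only: sqlite3 is used to illustrate the workflow, not VoltDB itself.
# Start with an empty database: no schema has been defined ahead of time.
conn = sqlite3.connect(":memory:")

# Define the schema over the live connection (hypothetical table and columns).
conn.execute(
    "CREATE TABLE votes (phone_number INTEGER NOT NULL, contestant INTEGER NOT NULL)"
)
conn.execute("INSERT INTO votes VALUES (5551234, 1)")

# Evolve the schema later, again without restarting or reloading anything.
conn.execute("ALTER TABLE votes ADD COLUMN state TEXT")
conn.execute("INSERT INTO votes VALUES (5555678, 2, 'MA')")

rows = conn.execute(
    "SELECT phone_number, contestant, state FROM votes ORDER BY phone_number"
).fetchall()
```

The row inserted before the schema change simply has no value for the new column.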
Second, we delivered data importers and exporters that make it easy to feed streams of data into VoltDB and, once VoltDB has processed the data, to export it to downstream historical archives like Hadoop.
For streaming front-end integration, VoltDB v5.0 connects to message queue tools like Kafka. On the back end, VoltDB v5.0 efficiently exports data to Hadoop or your OLAP system of choice for historical archiving. In between, VoltDB v5.0 provides high velocity, transactional ingestion of data and events, provides real-time analytics on windows of streaming data, and allows for low-latency, per-event decision-making – the ability to react and apply business decisions to individual events. We call this the Fast Data Pipeline. Read more about it in our recently released book.
Features of v5.0 include:
- New integrations into the Hadoop/Big Data ecosystem designed to help ingest streaming data, process it within VoltDB, and export data seamlessly to a historical data warehouse. Export integrations include Kafka; HDFS Export; HTTP Export; and RabbitMQ Export. Import integrations include the Kafka Loader; the JDBC Loader; the VoltDB Hadoop OutputFormat implementation; a Vertica UDx; and support for Apache Hive and Apache Pig.
- Support for exporting data in Avro format.
- Enhanced SQL support (SQL 92)
- Capped Collections – age out table data automatically.
- Query timeout – prevent runaway queries.
- VoltDB Management Center (VMC) - browser-based, one-stop monitoring and configuration management of your VoltDB database.
In addition, we’ve put together two new sample apps to let you experience for yourself how well VoltDB v5.0 works. Check out the Fast Data Pipeline and Real-time Analytics Example Applications.
Download v5.0 today, and let us know what you think!
In-Memory Performance with On-Disk Durability and High Availability
VoltDB’s in-memory architecture is designed for performance. It eliminates the significant overhead of multi-threading and locking responsible for the poor performance of traditional RDBMSs that rely on disks.
VoltDB is also designed to ensure that data is never lost. Because VoltDB is an in-memory database, a frequent question is “can data be lost?” Ensuring that it cannot was a foundational requirement when VoltDB was designed. VoltDB’s Snapshots and Command Logging features allow you to recover fully, quickly and easily. Just bring your database back up and VoltDB will do all of the heavy lifting – restoring physical data from snapshots, rebuilding indexes, and replaying transaction logs. VoltDB will have you back to normal operations in no time.
VoltDB snapshots a consistent point-in-time view of the in-memory data and serializes it to local disk. Snapshots are written at each server and are consistent across servers. Read more about snapshots.
To protect data between snapshots, VoltDB logs transaction invocations to disk. VoltDB refers to this as the command log. Command logs are also written at each server. To recover, the snapshot is restored and the command log is replayed. Together, snapshots and command logs create durable, replicated copies of the database across all servers. Read more about command logs.
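The recovery sequence described above (restore the snapshot, then replay the command log) can be sketched with a toy in-memory store. This is an illustrative model only, not VoltDB's implementation; the `DurableStore` class and its `put` and `add` procedures are hypothetical.

```python
import copy


class DurableStore:
    """Toy in-memory table with a snapshot and a command log (hypothetical)."""

    def __init__(self):
        self.data = {}          # live in-memory state
        self.snapshot = {}      # last point-in-time copy, standing in for disk
        self.command_log = []   # invocations logged since the last snapshot

    def invoke(self, procedure, *args):
        # Log the invocation before applying it, so it can be replayed later.
        self.command_log.append((procedure, args))
        self._apply(procedure, args)

    def _apply(self, procedure, args):
        if procedure == "put":
            key, value = args
            self.data[key] = value
        elif procedure == "add":
            key, delta = args
            self.data[key] = self.data.get(key, 0) + delta
        else:
            raise ValueError(f"unknown procedure: {procedure}")

    def take_snapshot(self):
        # Capture a consistent copy; earlier log entries are no longer needed.
        self.snapshot = copy.deepcopy(self.data)
        self.command_log.clear()

    def recover(self):
        # Restore the snapshot, then replay the logged invocations in order.
        self.data = copy.deepcopy(self.snapshot)
        for procedure, args in self.command_log:
            self._apply(procedure, args)


store = DurableStore()
store.invoke("put", "balance", 100)
store.take_snapshot()
store.invoke("add", "balance", 25)   # happens after the snapshot
store.data = {}                      # simulate losing memory in a crash
store.recover()
```

Note that the log records the invocation (procedure name and arguments) rather than the resulting data, so replay re-executes the work in its original order.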
High Availability (HA)
VoltDB was designed for HA from the ground up. It’s easy to configure and completely transparent to your applications. Partitions are transparently replicated (active/active and synchronous) on multiple servers, so if a server fails, all data remains available, consistent, and durable for continued operation.
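To make the idea of synchronous replication concrete, here is a minimal sketch in which a write is applied to every live copy of a partition before it returns, so any surviving copy can keep serving after a failure. The `ReplicatedPartition` class is hypothetical and leaves out everything a real cluster needs (networking, failure detection, rejoin).

```python
class ReplicatedPartition:
    """Toy partition kept on several servers at once (hypothetical model)."""

    def __init__(self, replica_count):
        self.replicas = [dict() for _ in range(replica_count)]
        self.alive = [True] * replica_count

    def write(self, key, value):
        # Synchronous: every live replica applies the write before we return.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                replica[key] = value

    def read(self, key):
        # Any live replica can answer, because all of them hold the same data.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                return replica[key]
        raise RuntimeError("no live replica for this partition")

    def fail(self, index):
        self.alive[index] = False


partition = ReplicatedPartition(replica_count=2)
partition.write("session-1", "active")
partition.fail(0)                          # one server goes down
value_after_failure = partition.read("session-1")
partition.write("session-2", "active")     # the survivor keeps accepting writes
```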
Transparent Scalability with Data Consistency (ACID)
VoltDB's fundamental redesign of the RDBMS provides unparalleled performance and scalability on bare-metal, virtualized and cloud infrastructures.
VoltDB uses a shared-nothing architecture to achieve database parallelism. Data and the processing associated with it are distributed among all the CPU cores within the servers composing a single VoltDB cluster. By extending its shared-nothing foundation to the per-core level, VoltDB exploits and scales with the increasing core-per-CPU counts on modern commodity servers.
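A minimal sketch of the routing idea behind shared-nothing partitioning: each row is owned by exactly one partition, chosen from its partitioning key, and partitions share no data. The CRC32 hash, the partition count and the `customer-N` keys are assumptions made for illustration; the text does not describe VoltDB's actual hashing scheme.

```python
import zlib

PARTITION_COUNT = 8  # hypothetical: for example 2 servers x 4 cores each


def partition_for(key: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map a partitioning-key value to the one partition that owns it."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


# Each partition holds only its own rows; nothing is shared between them.
partitions = [dict() for _ in range(PARTITION_COUNT)]


def upsert(key: str, value: int) -> int:
    owner = partition_for(key)
    partitions[owner][key] = value
    return owner


for i in range(1000):
    upsert(f"customer-{i}", i)

total_rows = sum(len(p) for p in partitions)
```

Because a key always maps to the same partition, work on that key can run on one core without coordinating with the others.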
Scaling server capacity is easy and 100% transparent to your application. Simply add servers to scale throughput and storage capacity -- no need to build complex and costly sharding layers. You can build your applications with the confidence that they’ll scale to meet increasing workloads.
VoltDB is also ACID compliant, meaning you don’t have to trade data consistency to achieve performance and scale. Transactions are guaranteed. Your data will be 100% accurate, 100% of the time.
Standards - SQL and Java, Integrations, and Client Support
VoltDB combines the richness and flexibility of SQL for data interaction with a modern, distributed, fault-tolerant, cloud-deployable clustered architecture while maintaining the ACID guarantees of a traditional database system.
VoltDB supports the JSON data type for agility in the application development process. VoltDB also supports several client access methods including stored procedures, JDBC and ad hoc queries. Stored procedures provide the fastest processing of data as queries are moved to the data for processing.
Integrations and Client Support
Recognizing the importance of working together in a broader software ecosystem, VoltDB supports a wide range of integrations, including JDBC (Java Database Connectivity) and ODBC (Open Database Connectivity) for data exchange. In addition to the tools and system procedures that VoltDB provides for monitoring the health of your database, you can also integrate this data into third-party monitoring solutions so they become part of your overall enterprise monitoring architecture.
VoltDB also provides drivers and SDKs to help connect applications written in a range of programming languages. Read more about clients and monitoring here.
VoltDB Integrations in the Data Warehouse Ecosystem
VoltDB offers a broad set of Big Data ecosystem integrations, certifications, industry partnerships and connectors to enable high-speed data export to Hadoop-based data warehouses and long-term analytics stores such as HP Vertica and IBM Netezza.
VoltDB Big Data integrations enable developers to take advantage of the speed and cyclical nature of the import-export data pipeline.
Partners and Certifications
Hortonworks is a leading commercial vendor of Apache Hadoop, the open source platform for storing, managing and analyzing Big Data. The Hortonworks Data Platform distribution of Apache Hadoop provides an open and stable foundation for enterprises and a growing ecosystem to build and deploy Big Data solutions. VoltDB is a Certified Hortonworks partner.
Cloudera offers an enterprise-class implementation of Apache Hadoop. The company’s Cloudera Enterprise helps developers benefit from the experience of the open source and Big Data/Hadoop communities. Cloudera Enterprise includes CDH, the world’s most popular open source Hadoop-based platform, as well as advanced system management and data management tools. VoltDB is a Certified Cloudera partner.
MapR provides developers with an enterprise-grade Hadoop platform. MapR brings dependability, ease of use and world-record speed to Hadoop, NoSQL, database and streaming applications in one unified distribution for Hadoop. VoltDB is a MapR Advantage Technology partner.
IBM Netezza – VoltDB’s IBM Netezza Export client uses the JDBC Connector to fetch transactional data from VoltDB. The data is written in batches to the Netezza data warehouse. Configuring this behavior is simple and requires no programming. Users automate the export process by identifying the specific VoltDB tables in the schema as sources for export data. At runtime, any data written to the specified tables is automatically sent to the VoltDB export connector, which manages the exchange of the updated information to the Netezza destination. The VoltDB export process transactionally queues export data to the connector automatically. The export client uses a series of poll and acknowledgement requests to transactionally exchange data between VoltDB and Netezza, guaranteeing at-least-once delivery of the data to the destination system. The export client runs within the VoltDB cluster, so it, like VoltDB, is highly available. IBM is a partner. Read more about the JDBC Connector in our documentation.
HP Vertica – VoltDB’s HP Vertica Export client uses the VoltDB JDBC Connector to fetch transactional data from VoltDB and write it, in batches, to the Vertica data warehouse. Configuring this behavior is simple and requires no programming.
Users automate the export process by identifying the specific VoltDB tables in the schema as sources for export data. At runtime, any data written to the specified tables is automatically sent to the VoltDB export connector, which manages the exchange of the updated information to the Vertica destination. The VoltDB export process transactionally queues export data to the connector automatically. The export client uses a series of poll and acknowledgement requests to transactionally exchange data between VoltDB and Vertica, guaranteeing at-least-once delivery of the data to the destination system. The export client runs within the VoltDB cluster, so it, like VoltDB, is highly available. HP Vertica is a partner. Read more about the JDBC Connector in our documentation.
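The poll-and-acknowledge exchange can be sketched as follows. This is a simplified model, not the VoltDB export client: `ExportQueue`, `export_all` and `flaky_write` are hypothetical names, and the destination is just a Python list. Rows leave the queue only after the destination write succeeds, so a lost acknowledgement leads to a resend rather than a lost row.

```python
from collections import deque


class ExportQueue:
    """Rows stay queued until the destination acknowledges them (hypothetical)."""

    def __init__(self, rows):
        self._pending = deque(rows)

    def poll(self, batch_size):
        # Return the next batch without removing it from the queue.
        return list(self._pending)[:batch_size]

    def ack(self, count):
        # Only now are the rows dropped from the queue.
        for _ in range(count):
            self._pending.popleft()

    def __len__(self):
        return len(self._pending)


def export_all(queue, write_batch, batch_size=2):
    while len(queue) > 0:
        batch = queue.poll(batch_size)
        try:
            write_batch(batch)
        except ConnectionError:
            continue  # not acknowledged, so the same batch is polled again
        queue.ack(len(batch))


received = []                 # stands in for the destination warehouse
failures = {"remaining": 1}   # fail once to simulate a lost acknowledgement


def flaky_write(batch):
    received.extend(batch)    # the rows reach the destination...
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise ConnectionError("acknowledgement lost")  # ...but the ack does not


queue = ExportQueue(["r1", "r2", "r3"])
export_all(queue, flaky_write)
```

In this run the first batch is delivered twice, which is exactly what at-least-once delivery permits: every row arrives, and the destination should be prepared to tolerate duplicates.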
VoltDB supports a wide range of export connectors to support integration with other data management components including CSV, WebHDFS/Hadoop, Kafka, RabbitMQ, and JDBC. The JDBC Connector provides export to data warehouse technologies such as IBM Netezza and HP Vertica. VoltDB also provides developers with simple-to-use examples and instructions to build custom, open-source export connectors. VoltDB Export enables data to arrive in your analytic store sooner, and allows deep analytics to be leveraged with radically lower latency. Read about VoltDB Export in our documentation.
VoltDB Connectors, message queues and interfaces
VoltDB serves as a real-time application database used in conjunction with Hadoop and analytical results derived from Hadoop in applications including real-time scoring, policy enforcement, and customer interaction. VoltDB provides the ability to ingest data as fast as it arrives; perform real-time analytics in-memory; make automated decisions in real time; and continuously pass, or export, processed data into Hadoop. For more about Hadoop integrations, click here.
- Apache Kafka Connector
VoltDB’s Apache Kafka Export, paired with the Kafka importer utility, allows developers to build applications in which VoltDB can both transact on incoming Kafka messages and deliver data and alerts to Kafka feeds downstream, enabling VoltDB applications to analyze and make decisions on data in the moment.
Apache Kafka is a persistent, high performance, distributed message queue/service. Kafka is highly available, partitions (or shards) messages, and is simple and efficient to use. Useful for serializing and multiplexing streams of data, Kafka provides "at least once" delivery, and gives clients (subscribers) the ability to rewind and replay streams.
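The rewind-and-replay behavior follows from the log being persistent and each subscriber tracking its own read position. A minimal in-process sketch of that idea is below; `MessageLog` and `Subscriber` are hypothetical classes and this is not the Kafka client API.

```python
class MessageLog:
    """Append-only log; messages are kept after they are read (hypothetical)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1   # offset of the new message

    def read(self, offset, max_count):
        return self._messages[offset:offset + max_count]


class Subscriber:
    def __init__(self, log):
        self.log = log
        self.offset = 0   # position of the next message to read

    def poll(self, max_count=10):
        batch = self.log.read(self.offset, max_count)
        self.offset += len(batch)
        return batch

    def rewind(self, offset):
        # Reading never deletes messages, so moving back is always possible.
        self.offset = offset


log = MessageLog()
for event in ["login", "click", "purchase"]:
    log.append(event)

subscriber = Subscriber(log)
first_pass = subscriber.poll()
subscriber.rewind(1)            # go back and replay from the second message
replayed = subscriber.poll()
```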
In the Apache Kafka model, VoltDB export acts as a Producer. The Kafka connector receives serialized data from the export tables and writes it to a message queue using the Apache Kafka version 0.8 protocols.
Both Kafka and VoltDB are built around shared-nothing clustering. Load is distributed among cluster nodes for performance. Data is replicated among cluster nodes for safety and availability. To handle increasing loads, nodes can be transparently added to the cluster. Nodes can fail or be removed and the remaining cluster will continue to function. Both systems are designed without single points of failure. These features are the hallmark of systems designed for scale.
Kafka is one of the most frequently used streaming vehicles in the Big Data application space. Because of its persistence capabilities, it is often used to front-end Hadoop data feeds. Read more about VoltDB and the Kafka Connector in our documentation.
The integration of RabbitMQ with VoltDB expands developers’ options to export data from VoltDB. RabbitMQ is a popular, scalable, asynchronous message queueing service that supports multiple platforms, multiple languages, and multiple protocols, including AMQP.
RabbitMQ is in wide use in enterprises developing and running applications in the cloud. Read more about the RabbitMQ Connector in our documentation.
Hive is a data warehouse application that can query large datasets held in distributed storage. Hive is a runtime Hadoop support structure that allows developers fluent in SQL to leverage the Hadoop platform with minimal effort.
VoltDB v5.0 includes a VoltDB Hadoop OutputFormat implementation, which can be used to import job data from Hadoop into VoltDB. This is further leveraged by our Hadoop connectors for both Apache Pig and Apache Hive.
VoltDB developers can export VoltDB data in Pig format to Hadoop. Pig was developed to enable developers using Hadoop to focus on analyzing large data sets and spend less time writing mapper and reducer programs.
Pig includes two components: the Pig Latin language, and a runtime environment where Pig Latin programs are executed.
Avro provides a serialization and data exchange format for Hadoop. Avro is used natively by Hadoop utilities such as Pig and Hive. Because it is a binary format, Avro data takes up less network bandwidth than text-based formats such as CSV, providing VoltDB developers with efficiencies when moving processed data out of Hadoop and into VoltDB. Avro features rich data structures; a compact, fast, binary data format; a container file to store persistent data; and Remote Procedure Call (RPC).
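To get a feel for why a binary encoding is more compact than text, the sketch below serializes the same records as CSV and as fixed-width binary. It uses Python's `struct` module as a stand-in and is not Avro's actual encoding (Avro has its own schema-driven format); the record layout of timestamp, price and quantity is hypothetical.

```python
import struct

# Hypothetical records: (event time in microseconds, price, quantity).
records = [(1420070400000000 + i, 1999.99, 250000 + i) for i in range(100)]

# Text encoding: one CSV line per record.
csv_bytes = "".join(
    f"{event_time},{price:.2f},{quantity}\n"
    for event_time, price, quantity in records
).encode("utf-8")

# Binary stand-in: 8-byte integer, 8-byte double, 4-byte integer, no padding.
RECORD_FORMAT = "<qdi"
binary_bytes = b"".join(
    struct.pack(RECORD_FORMAT, event_time, price, quantity)
    for event_time, price, quantity in records
)

# The binary form round-trips back to the original values.
record_size = struct.calcsize(RECORD_FORMAT)
first_record = struct.unpack(RECORD_FORMAT, binary_bytes[:record_size])
```

Here each binary record takes 20 bytes regardless of how many digits the values have, while each CSV line grows with the length of the printed numbers.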
Avro offers simple integration with dynamic languages. Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation is an optional optimization, worth implementing only for statically typed languages.
Avro was designed for developers who prefer strongly typed data serialization or protocol buffer-style tools but who want the flexibility of easy interoperability with dynamic languages.
- Creating a Custom Export
It is also possible to create a custom export connector that runs inside VoltDB. Click here for instructions.
VoltDB is the only open source in-memory NewSQL database. VoltDB was created as an open source project by co-founder Dr. Michael Stonebraker. Once the decision was made to launch a company to support the project, we worked to build an organization committed to developer and customer success with our products.
Why is it important for VoltDB to be open source? The open source community values integrity, participation, the open exchange of ideas, shared purpose, and support for the best ideas. We share those values: our company culture is open and collaborative. We work closely with customers, establishing open, supportive relationships between our developers and our customers’ developers.
VoltDB is developed using the Agile methodology; we believe in rapid prototyping and push updates out to customers on a monthly schedule. Our developers integrate VoltDB with key open source ecosystem components including Kafka, RabbitMQ, Docker, and Hadoop, with more in the pipeline. We also contribute to projects including Kafka, RabbitMQ, and Nagios. Our customers and community members contribute to VoltDB’s open source project on GitHub, helping to improve the quality and robustness of the project and the downstream product.
We make our code available under the popular AGPL license, and offer a free community edition which is used by many community members and educational organizations. We encourage prospects to inspect our code using the community edition.
With VoltDB, our users have the best of both worlds: a freely-available community edition built with open source methodologies, and a robust, commercial-strength product. We invite you to take a look at the community edition, or give us a call to discuss how the commercial edition can meet your needs.