
Fast Data + Big Data = VoltDB + Hadoop

Thursday, September 18, 2014 - 2:15pm

It’s a common refrain here at VoltDB: Big Data is created by Fast Data.


VoltDB can ingest and transact on data at phenomenal rates, and as a result it often finds itself at the front end of fast data pipeline applications. Its role in these pipelines is to ingest and process “hot” data, perhaps all of the data created today or this week, for real-time analytics and decisioning.
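To give a concrete feel for the ingest side, here is a minimal sketch using the VoltDB Java client. The table, stored procedure name, and connection details are hypothetical illustrations, not from this post:

```java
import org.voltdb.client.Client;
import org.voltdb.client.ClientFactory;
import org.voltdb.client.ClientResponse;

public class IngestExample {
    public static void main(String[] args) throws Exception {
        // Connect to a VoltDB node (host name is a placeholder).
        Client client = ClientFactory.createClient();
        client.createConnection("voltdb-node1");

        // Invoke a (hypothetical) stored procedure that records one event.
        // Single-partition procedures like this are how VoltDB sustains high
        // ingest rates: each call runs as a serialized transaction on one
        // partition, with no locking or latching.
        ClientResponse resp = client.callProcedure(
                "RecordEvent",              // hypothetical procedure name
                12345L,                     // device id (partitioning column)
                System.currentTimeMillis(), // event timestamp
                "click");                   // event type
        if (resp.getStatus() != ClientResponse.SUCCESS) {
            System.err.println(resp.getStatusString());
        }

        client.drain();
        client.close();
    }
}
```

Because each such call is a single-partition transaction, throughput scales with the number of partitions in the cluster.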


As data ages and becomes historical, the cold data can be written transactionally to VoltDB Export Tables. Export Connectors then automatically move that data out of VoltDB to long-term storage, the “Data Lake”, completing the data pipeline. VoltDB v4.7, our September 2014 release, introduces a new Export Connector for Hadoop: it receives the serialized data from the Export Tables and writes it out to Hadoop via HTTP requests to WebHDFS.
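Under the hood, writing a file through WebHDFS is a two-step REST exchange: a PUT with op=CREATE sent to the NameNode, which answers with a 307 redirect naming the DataNode to write to, followed by a second PUT of the actual bytes to that redirect location. Here is a minimal sketch of that raw protocol (host, port, path, and user name are placeholders; this is illustrative of WebHDFS itself, not VoltDB's connector code):

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHdfsWrite {
    public static void main(String[] args) throws Exception {
        // Step 1: ask the NameNode to create the file. WebHDFS replies
        // with a 307 redirect pointing at a DataNode.
        URL createUrl = new URL(
            "http://namenode.example.com:50070/webhdfs/v1/export/events.csv"
            + "?op=CREATE&overwrite=false&user.name=voltdb");
        HttpURLConnection nn = (HttpURLConnection) createUrl.openConnection();
        nn.setRequestMethod("PUT");
        nn.setInstanceFollowRedirects(false); // we want the Location header
        nn.connect();
        String dataNodeUrl = nn.getHeaderField("Location");
        nn.disconnect();

        // Step 2: PUT the file contents to the DataNode URL.
        HttpURLConnection dn =
            (HttpURLConnection) new URL(dataNodeUrl).openConnection();
        dn.setRequestMethod("PUT");
        dn.setDoOutput(true);
        try (OutputStream out = dn.getOutputStream()) {
            out.write("12345,1411066500000,click\n".getBytes("UTF-8"));
        }
        // WebHDFS returns 201 Created on success.
        System.out.println("DataNode response: " + dn.getResponseCode());
        dn.disconnect();
    }
}
```

Subsequent batches can be added to an existing file with op=APPEND, which follows the same redirect pattern.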


If the final resting point for your data is not Hadoop/HDFS, VoltDB also provides Export Connectors for flat CSV files, JDBC data sources, Vertica, Netezza, and message queues such as Kafka and RabbitMQ. We’ve also published tips on how to create your own custom exporter, found here: http://voltdb.com/creating-custom-export-connector-runs-inside-voltdb. If your application requires a specific integration we don’t provide, please let us know at info@voltdb.com.
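As a rough idea of what the linked post walks through, a custom connector extends VoltDB's export client base classes and decodes rows as they arrive. The skeleton below is a sketch of that API surface from memory; class and method signatures should be checked against the post and your VoltDB version:

```java
import java.io.IOException;
import java.util.Properties;

import org.voltdb.export.AdvertisedDataSource;
import org.voltdb.exportclient.ExportClientBase;
import org.voltdb.exportclient.ExportDecoderBase;

// A do-almost-nothing custom export connector: VoltDB hands each exported
// row to processRow(), where you would forward it to your target system.
public class LoggingExportClient extends ExportClientBase {

    @Override
    public void configure(Properties config) {
        // Read connector properties from the deployment file here.
    }

    @Override
    public ExportDecoderBase constructExportDecoder(AdvertisedDataSource source) {
        return new LoggingDecoder(source);
    }

    static class LoggingDecoder extends ExportDecoderBase {
        LoggingDecoder(AdvertisedDataSource source) {
            super(source);
        }

        @Override
        public boolean processRow(int rowSize, byte[] rowData) {
            try {
                // decodeRow() turns the serialized row into column values.
                Object[] columns = decodeRow(rowData);
                System.out.println(java.util.Arrays.toString(columns));
            } catch (IOException e) {
                return false; // signal failure so the block is retried
            }
            return true;
        }

        @Override
        public void sourceNoLongerAdvertised(AdvertisedDataSource source) {
            // Clean up any resources held for this export stream.
        }
    }
}
```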