The important point when planning to export data, is deciding:
What data to export
When to export the data
Whether to run the export client locally on the database servers or as a separate remote application
It is possible to export all of the data in a VoltDB database. You would do this by creating export-only replicas of all tables in the schema and writing to the export-only table whenever you insert into the normal table. However, this means the same number of transactions and volume of data that is being processed by VoltDB will be exported through the connector. There is a strong likelihood, given a high transaction volume, that the target database will not be able to keep up with the load VoltDB is handling. As a consequence you will usually want to be more selective about what data is exported when.
If you have an existing target database, the question of what data to export is likely decided for you (that is, you need to export the data matching the target's schema). If you are defining both your VoltDB database and your target at the same time, you will need to think about what information is needed "downstream" and create the appropriate export-only tables within VoltDB.
The second consideration is when to export the data. For tables that are not updated frequently, inserting the data to a complementary export-only table whenever data is inserted into the real table is the easiest and most practical approach. For tables that are updated frequently (hundreds or thousands of times a second) you should consider writing a copy of the data to an export-only table at an appropriate milestone.
The third consideration is where to run the export client. The recommended process is, where possible, to run the export client on the database server(s). Running the export client on the servers has two key advantages:
Export processing is distributed across the cluster, with each server exporting its own portion of the export workload. If you use a single remote export client, the client can become a bottleneck since it must process export data from all nodes of the cluster.
Export processing starts and stops automatically. You do not need to manage and monitor the export client as a separate process.
Using the flight reservation system as an example, one aspect of the workflow not addressed by the application described in Chapter 3, Designing Your VoltDB Application is the need to archive information about the flights after takeoff. Changes to reservations (additions and cancellations) are important in real time. However, once the flight takes off, all that needs to be recorded (for billing purposes, say) is what reservations were active at the time.
In other words, the archiving database needs information about the customers, the flights, and the final reservations. According to the workload in Table 3.1, “Example Application Workload”, the customer and flight tables change infrequently. So data can be inserted into the export-only tables at the same time as the "live" flight and reservation tables. (It is a good idea to give the export-only copy of the table a meaningful name so its purpose is clear. In this example we identify the export-only tables with the export_ prefix or, in the case of the reservation table which is not an exact copy, the _final suffix.)
The reservation table, on the other hand, is updated frequently. So rather than export all changes to a reservation to the export-only reservation table in real-time, a separate stored procedure is invoked when a flight takes off. This procedure copies the final reservation data to the export-only table and deletes the associated flight and reservation records from the VoltDB database. Figure 13.2, “Flight Schema with Export Table” shows the modified database schema with the added export-only tables, EXPORT_FLIGHT, EXPORT_CUSTOMER, and RESERVATION_FINAL.
This design adds a transaction to the VoltDB application, which is executed approximately once a second (when a flight takes off). However, it reduces the number of reservation transactions being exported from 1200 a second to less than 200 a second. These are the sorts of trade offs you need to consider when adding export functionality to your application.