The export-to-file client fetches the serialized data from the export connector and writes it to disk as text files (either comma- or tab-separated). For best performance and ease of use, you should run the export-to-file client on the database server(s) as described in Section 13.6.1, “Running the Export Client on the Database Server”.
However, it is possible to run the client as a remote process. Section 13.8.2, “The Export-to-File Client Command Line” describes the command line to run the client remotely.
When the export-to-file client receives export data, it de-serializes the content and writes it out to disk, one file per database table, "rolling" over to new files periodically. The filenames of the exported data are constructed from:
A unique prefix (specified with --nonce)
A unique value identifying the current version of the database catalog
The table name
A timestamp identifying when the file was started
While the file is being written, the file name also contains the prefix "active-". Once the file is complete and a new file started, the "active-" prefix is removed. Therefore, any export files without the prefix are complete and can be copied, moved, deleted, or post-processed as desired.
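Because completed files never carry the "active-" prefix, a post-processing job can select them by name alone. The following is a minimal sketch, assuming the export files sit directly in a single directory (that is, --batched is not in use). The function name list_complete_exports is hypothetical and not part of VoltDB.

```shell
# List export files that are complete, i.e. whose names do not start
# with "active-". Hypothetical helper; not shipped with VoltDB.
list_complete_exports() {
    local dir
    dir="${1:-.}"
    # -maxdepth 1: look only at files directly inside the export directory.
    # The file still being written has the "active-" prefix and is skipped.
    find "$dir" -maxdepth 1 -type f ! -name 'active-*' | sort
}
```

Calling list_complete_exports with the output directory prints one completed file per line, which can then be piped to whatever copies, moves, or loads the data.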
There are two main options when running the export-to-file client:
The --type option lets you choose between comma-separated files (csv) or tab-delimited files (tsv).
The --batched option tells the export client to group all of the files for one time period into a subfolder, rather than keeping all of the files in a single directory. In this case, when the export client "rolls" the files, it creates a new subfolder, and it is the folder rather than the files that carries the "active-" prefix.
Whatever options you choose, the order and representation of the content within the output files is the same. The export client writes a separate line of data for every INSERT it receives, including the following information:
Six columns of metadata generated by the export connector. This information includes a transaction ID, a timestamp, a sequence number, the site and partition IDs, as well as an integer indicating the query type.
The remaining columns are the columns of the database table, in the same order as they are listed in the database definition (DDL) file.
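Since the six metadata columns always come first, a downstream script can drop them by position when only the table columns are needed. The sketch below assumes that none of the six metadata fields contains the delimiter character and that no field contains an embedded line break. The function name strip_export_metadata is hypothetical.

```shell
# Print only the table columns of an export file by dropping the six
# leading metadata columns. Hypothetical helper; not shipped with VoltDB.
# Assumption: the metadata fields never contain the delimiter themselves.
strip_export_metadata() {
    local file delim
    file="$1"
    delim="${2:-,}"   # default to comma for csv; pass a tab for tsv files
    # Fields 7 onward are the table columns, in DDL order.
    cut -d "$delim" -f 7- "$file"
}
```

For tab-delimited output, pass a literal tab as the second argument, for example "$(printf '\t')". If the metadata is never needed, the --skipinternals argument omits it at the source instead.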
If you choose to run the export-to-file client remotely, VoltDB includes a shell command that lets you specify export properties, similar to those you can specify for server-based export. In its simplest form, the command looks something like the following:
$ exporttofile \
    --connect client \
    --servers myserver \
    --nonce ExportData \
    --type csv
The complete syntax of the command line is as follows:
$ exporttofile {arguments...}
$ exporttofile --help
The supported arguments are:
--servers: A comma-separated list of host names or IP addresses to query.
--nonce: The prefix to use for the files that the client creates. The client creates a separate file for every table that is exported, constructing a file name that includes a transaction ID, the nonce, the name of the table, a timestamp, and a file type specified by the --type argument.
--connect: The port to connect to. You specify the type of port (client or admin), not the port number.
--type: The type of files to create. You can specify either csv (for comma-separated files) or tsv (for tab-delimited files).
The username to use for authenticating to the VoltDB server(s). Required only if security is enabled for the database.
The password to use for authenticating to the VoltDB server(s). Required only if security is enabled for the database. If you specify a username but not a password, the export client prompts you for the password.
(Optional.) The directory where the output files are created. If you do not specify an output path, the client writes the output files to the current default directory.
--period: (Optional.) The frequency, in minutes, for "rolling" the output file. The default frequency is 60 minutes.
--batched: (Optional.) Store the output files in subfolders that are "rolled" according to the frequency specified by --period. The subfolders are named according to the nonce and the timestamp, with "active-" prefixed to the subfolder currently being written.
(Optional.) Writes a JSON representation of each table's schema as part of the export. The primary output files of the export-to-file client contain the exported data in rows, but do not identify the datatype of each column. The JSON schema files can be used to ensure the appropriate datatype and precision is maintained if and when the output files are imported into another system.
(Optional.) The format to use when encoding VARBINARY data for output. Binary data is encoded in either BASE64 or hexadecimal format. The default is hexadecimal.
(Optional.) Alternate delimiter characters for the CSV output. The text string specifies four characters: the field delimiter, the enclosing character, the escape character, and the record delimiter. To use special or non-printing characters (including the space character), encode the character as an HTML entity, for example "&lt;" for the "less than" symbol.
(Optional.) The format of the date used when constructing the output file names. You specify the date format as a Java SimpleDateFormat string. The default format is "yyyyMMddHHmmss".
(Optional.) The time zone to use when formatting the timestamp. Specify the time zone as a Java timezone identifier. The default is GMT.
--skipinternals: (Optional.) Eliminates the six columns of VoltDB metadata (such as transaction ID and timestamp) from the output. If you specify --skipinternals, the output files contain only the exported table data.
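To illustrate how the optional arguments combine with the required ones, the following sketch assembles a command line and prints it without running it. The wrapper function build_export_command and its positional parameters are hypothetical. Every flag it emits (--connect, --servers, --nonce, --type, --period, --batched) is one named in this section, and the defaults of csv and 60 minutes mirror the values used above.

```shell
# Assemble an exporttofile command line and print it (dry run).
# Hypothetical wrapper; not shipped with VoltDB.
build_export_command() {
    local servers nonce type period
    servers="$1"
    nonce="$2"
    type="${3:-csv}"      # csv or tsv
    period="${4:-60}"     # rollover frequency in minutes
    case "$type" in
        csv|tsv) ;;
        *) echo "type must be csv or tsv" >&2; return 1 ;;
    esac
    echo "exporttofile --connect client --servers $servers --nonce $nonce --type $type --period $period --batched"
}
```

For example, build_export_command myserver ExportData tsv 30 prints a command that would write tab-delimited files into subfolders rolled every 30 minutes. Printing rather than executing makes it easy to review the arguments before starting a long-running export.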
