Introducing VoltDB.NET: C# Library for Your VoltDB Applications
Written by Seb Coursol
I’m happy to introduce our first .NET C# library for VoltDB, which is available for download (full source with samples & documentation) at: https://github.com/VoltDB/voltdb-client-csharp (Latest build-only with documentation from the downloads page).
Today I would like to go over the basics of how you can leverage the library to create a basic client application to connect to your VoltDB server…
There will be more posts in the future, on general best practices, optimized asynchronous application designs or detailed looks at specific features – here I will simply go over the basics to wet your appetite and get you started!
Basic Building Blocks of a VoltDB.NET Application
As you know a VoltDB database isn’t your standard run-of-the-mill database: it is a high-performance system built and bred for a single purpose: record-breaking transactional performance. This means you don’t operate with it the same way you would a “normal” DBMS. You won’t see Datasets, DataReaders or DataContexts here: though a lot of it would probably be feasible, the mere overhead of the implementation would defeat the purpose of consuming a high-performance engine! But you still get a lightweight, simple and LINQ-friendly library that makes the job of integrating VoltDB in your .NET application framework a breeze.
As you know, access to a VoltDB database is made exclusively through stored-procedures, and those procedures have a standard return type: a list of Tables. Thus, interaction with a VoltDB system is fairly simple: you connect to the server, call procedures and retrieve tabular results.
Owning to that fact, there are 3 main components you will be working with in your C# applications:
- VoltConnection - Your connection to the VoltDB node/cluster.
- ProcedureWrappers - Strongly-typed wrappers to access your stored procedures.
- Tables - Your access to the results returned by procedure calls.
The rest is all “fluff” added on top to simplify your life.
Learn by example: Hello World
Let’s look at the most basic and classic sample in the book. We’ll go line by line, but first let’s look at the overall code. As I detail each section, I’ll skip on all the exception handling details to focus on the library-specific features.
Simple enough, right? Let’s get started.
Adding the Library to Your Application
The first step is obviously to add the assembly as a reference to your application. I strongly suggest you lug along the .xml file with the .dll file (as in: just keep both files in the same folder, wherever you want on your machine). This will allow Visual Studio to pickup the documentation data and provide you with it in Intellisense – it saves a lot of time!
After that, simply add the assembly to your “using” list; there is only one namespace available to you, so you won’t get lost:
Creating a Database Connection
You create a connection using the static factory method Create on the VoltConnection class. You can pass in a connection string, or a ConnectionSettings object (both are interchangeable; there is actually an implicit conversion operator between ConnectionSettings and string).
The ConnectionSettings class has the advantage to be more explicit (and does provide useful Intellisense support), however, once your application is developed, you will usually find that using a connection string is much more flexible: you can store it in your application’s app.config or Settings file and change it whenever without ever having to recompile the code.
We will discuss the different connection settings and how to best chose them in another article – in general, the defaults (what you get if you don’t specify them in your connection string too) are adequate in most situations. For today, we will stick with a basic connection string and the most important keyword in it: hosts
This is how you indicate which servers to connect to. You can put a single server name or IP address, or a comma or space separated list of names/IP addresses. When you list multiple VoltDB node, the connection object will automatically load balance traffic through each connection, as well as do its best effort to maintain connectivity. If you have 2 nodes and the connection to one node is lost, the client will keep operating one the first node while attempting, in the background to reconnect to the second node. Coupled with the load balancing, this is a great way to ensure your client stays connected at all time.
Note: Obviously, if one of the connections is lost, some pending transactions will be in a transient state (the server received them, but you did not get the response), which is never a pleasant situation to deal with. In such a case, the connection client will trigger your custom callback, passing along a VoltClientAbortException so you can at least act accordingly.
Once you have your connection object, you need to open it. Because the method returns the instance on which you run it, you can do this in a single line of code while still performing the variable assignment, as you can see in the sample. Closing the connection is important to make sure you disconnect the underlying socket to the VoltDB servers. Because the connection is disposable, you can ensure you will always do this through the using statement. Your specific use case will dictate whether this approach is adequate or not: opening a connection is always expensive and, often, a service architecture where client threads access a shared connection object will be more appropriate. This would be the case if you wanted to use a VoltDB database to support an ASP.NET website, for instance.
We will get back to to subject of closing the connection later. Let’s jump ahead to the next step in our application.
Define Procedure Wrappers
A design time, the only person who knows the name and parameters of your procedures is you. With a traditional DBMS, you would use a data context to code-generate all the wrappers for you, but this isn’t something we have added to the client library yet. This left us with two options:
- Having a generic call method where you would pass the name of the procedure (as a string) and a list of objects as parameters
- Letting you define, with minimal work, strongly-typed wrappers that act just like the perfectly auto-generated wrappers you get from a DataContext
Obviously, the first method is very error-prone: a typo in the name of the procedure or an accidental swap/type violation in the arguments you pass and unpredictable results are on their way! So we came up with ProcedureWrappers and a lot of auto-generated code to avoid this. Funny enough, beneath the scenes, the connection object actually uses method #1, but from your end, this is all invisible and you are able to build yourself a type-safe, developer-friendly facade to code with.
All it takes is one line of code for each procedure:
On the VoltConnection instance you created, the Procedures property give you access to a (massive) list of generic methods. In truth, it all appears like one single method with 2 signatures: the Wrap method. As parameters, you pass it the name of your stored procedure – this should be the only place in your application you will have to type that unsafe string (so no risk of typos anywhere after that); optionally, if you intend to do asynchronous processing and process the results in a separate method, you can pass a call back method.
The last piece of the puzzle is the list of type parameters you provide to the method. For this, there are 21 overloads, allowing you to defined wrappers for procedures taking from 0 to 20 parameters.
The very first type parameter is the return type of the procedure, which can be any of the following:
This is the default return type for VoltDB procedures: an array of Table objects, each possibly with a different schema
If you know the procedure will only return a single table, save yourself form having to access it through an indexer as Result and simply wrap your procedure accordingly. Beware: if the procedure returns more than 1 table, the deserialization process will throw an exception to indicate you have wrongly surmissed the output of your request!
There are cases where you know the procedure will only send you a table with 1 single row. Defining the procedure adequately, will yield a more lightweight object (where you don’t have to iterate through a Row collection when it isn’t relevant!). Again, beware: the de-serialization process will throw an exception if the table has more than 1 row!
A bit of a corner case, but perfect for when you expect multiple single-row tables to be returned to you.
T, where T: byte, int, etc.
As a similar approach to SingleRowTables, there are cases where you expect 1 single-column table to be returned to you. In this case, you also know that the data type of every element will obviously be the same. So instead of getting a bulky Table object, you can ask for an immediate conversion to an array of value-type elements. For instance, if the VoltDB data type of the column is BIGINT, you could type the result as: long?.
T[ ][ ]
As always, you might be given multiple tables. In this corner case, you would expect more than 1 table, and every table to be a single-column table of the very same type.
There are cases where the procedure returns a single value (such as an application-specific status or an “affected rows” count). As always, you can save the bulk of receiving a Table object and simply indicate the value type you are expecting to receive.
The Null type is useful both when the procedure returns nothing (and empty array of Tables) or when you effectively want to discard the object, which can be useful for cases where you want to make sure the request completed (so you assign a callback to your wrapper), but don’t intend to do anything with whatever result comes back, other than notifying a third party “the request was completed”.
All the other type parameters of your call (from 0 to 20) defined the type of your procedure’s parameters themselves, any of the supported value-types (short, int, long, etc.). For more details on type support, see the reference tables at the end of the article.
Calling Your Procedures
Armed with our wrappers, posting requests to your VoltDB server is now a single, type-safe, call away:
Because you have now defined strongly-typed wrappers, the compiler will know exactly what to expect when you write such code, and alert you about any type-casting issue. This is how easy that was!
Additional overloads are available on the Execute method to allow you to set a timeout for the request, after which you will move on in your code without waiting further. A DefaultCommandTimeout is available in the connection settings so you don’t necessarily always have to pass that value.,/p>
Note: The calls above are synchronous, which means each call will block until the response is received from the server (or times out based on the connection setting) – this is rarely ever something you will want to do since it limits your application throughput. For Asynchronous operations you can use the BeginExecute methods on your wrapper. The default signatures (with or without timeout) are the same as for the Execute method, with additional overloads allowing you to pass along a state object for your callback to receive. A sample asynchronous operation might look like this, for instance:
A further discussion on Asynchronous operations and best practices is definitely outside of the scope of this short introduction, however, if you have ever programmed with the standard .NET Async Pattern (.BeginOperation, .EndOperation, .CancelAsyncOperation and the intermediary IAsyncResult handle), then the VoltDB.NET library asynchronous calls will be familiar to you. The only notable difference is that, instead of receiving an anonymous IAsyncResult, your callbacks will receive a strongly-typed AsynResponseobject (where TResult is the type of your procedure’s return result: Table, etc.), saving you a clumsy casting operation.
Have a look at the Voter sample in the SDK source for a demonstration of Asynchronous calls with the library. Additionally, for more details on general concepts of Asynchronous programming, check out MSDN or this great and simple article on the basics at CSharp-online.
It is important to keep in mind that, only an Asynchronous-based client implementation will fully leverage the potential of your VoltDB cluster in a single client, easily exceeding tens of thousands of calls per second, even on a single-node VoltDB database, when synchronous calls will lag in the low hundreds! Beneath the hood, all calls in the library are actually asynchronous calls, for optimal performance, and synchronous methods are primarily offered for convenience.
Processing Your Results
Once your call has completed, it is time to process your results. Because the server response includes more than the mere result, all calls receive an actual Response object. Synchronous calls receive a Response, while asynchronous calls receive an AsyncResponse (where TResult is the type of your procedure’s return result: Table, etc.). As part of the IAsyncResult implementation, the AsyncResponse object gives you access to your optional state object passed along when you made the asynchronous call with .BeginExecute. That state object could, for instance, a transaction id created on the client side that you need to match to when receiving the response to notify a specific client, or any object needed for your post-processing.
The Key property on the response object, of course, is .Result: the actual result of your request. Be warned that, if there was any type of error during the request execution, getting this value will throw an exception containing the details of that error, such as a constraint violation on the server, a client-side timeout or any condition that may cause the request to fail. You can prevent this either by checking the response’s .Status property or using the .TryGetResult(out TResult result) method. Either approach will allow you to safely code without wrapping everything into a try/catch block.
We already saw one example of retrieving the response result:
One this call, we merely checked the convenience property .HasData on Table and SingleRowTable, equivalent to checking for .RowCount > 0, as a way for us to detect whether initialization of the HelloWorld sample catalog had already run, so we would avoid the inevitable “Constraint violation” we would receive trying to re-insert a record that was already there.
Let us now look at a case where we want to work with the result:
Because Table and SingleRowTable objects aren’t strongly-typed (you don’t know what data type each column is), working with them is sometimes a bit cumbersome as you have to call methods like .GetValue<int?>(1,1) (here to retrieve an INTEGER field – which can be nullable, thus usage of int? – at row #2, in column #2 (0-based indices)). As I write this I cannot even remember myself whether rowIndex or columnIndex comes first, and while Intellisense would tell you and other methods are also available to prevent that confusion, having to use typed Generic methods is little better than calling something like .GetNullableIntValue(1,1), which is as cumbersome on the eyes as it is error-prone.
To save you the trouble and risk, Table types expose a .Wrap method very similar to that we already saw for ProcedureWrappers, letting you perform a one-time type definition of your result and then move on to using a more lightweight object on which all properties are now strongly typed. This is what you can see on line #35 where the Wrapper exposes the properties Column1 and Column2, both of type string.
Note: All data types in VoltDB are nullable, and while for the specific catalog you operate in this might not apply, this is unfortunately not information the procedure makes available to us. As a consequence, internally, the Table store all values as Nullable types, thus, you have to cast your wrapper using those types: byte?, short?, etc. See the last part on data casting for more details.
Closing the Connection
Closing your connection is, at first glance, as simple as saying “close the door when you leave.” But things get more problematic in Asynchronous scenarios or on service-based architectures sharing a central connection object. Namely you want to make sure all queries have been completed (and callbacks triggered), before you close the socket, effectively refusing to receive any more information and results from the server. This is what the connection’s .Drain() function will do for you, which is also the default behavior when you call .Close() or otherwise Dispose of the connection as, in our sample, when utilizing a using statement.
In the case of a service architecture, because multiple clients start up, grab the shared connection, issue their requests then shut down, neither closing nor draining the connection will work (Drain applies to all pending requests – not just those of the client wishing to leave). While this is a proposed upgrade for the library where statically-controlled connections would offer the option to Drain entirely or only requests posted by the calling thread, you will – for now – have to resort to more cumbersome and traditional methods: tracking yourself the IAsyncResults of your requests until all have completed before you let the client application terminate itself (while leaving the connection itself untouched!).
Acceptable Type Conversions Between VoltDB and .NET
There would be a long discussion as to why specific types work in one place and not another, but it is simpler to just list out everything and point out the caveats when applicable:
|VoltDB Type||(SingleRow)Table.Wrap() and .GetValue(…)||Single-Column/Value Results (T[ ][ ], T[ ] and T)1,3||Procedure Parameters2|
- Null values in VoltDB are encoded with a special at-boundary constant for each data type, for instance, for TINYINT (natively corresponding to sbyte?/sbyte in .NET), ‘null’ is represented by -128. Using standard value types on a schema that allows ‘null’ values can be dangerous and lead to data corruption. However, extension methods have been provided should you wish to perform your own manual “null-check”. All core value types have an additional .IsVoltDBNull() method that will return ‘true’ when the value is equal to VoltDB’s internal representation of ‘null’ for this data type.
- Procedures also accept arrays as input, such as string, long, etc. – when the server-side procedure is so defined.
- The connection validates that you request a matching type-cast. Thus asking for a long? if the underlying data type if a TINYINT (thus a byte?) will cause a run-time exception.
- 2011-03-13: Support for the decimal data type has been added with v126.96.36.199 of the .NET client library. This is a specific decimal data type defined as a fixed-scale DECIMAL(38,12), not to be confused with the .NET data type that corresponds to DECIMAL(28,[0,28]). Our thanks go to Michael Giagnocavo for his contribution!
You should spend some time to review the Voter application sample in the SDK to familiarize yourself with Asynchronous operations. Once you’ve gone over it, you should know enough to build your own applications. Have a look at the Compiled HTML Help file that ships with the SDK and will provide you many details not even mentioned here.
Our next article will explore some of the finer points of configuring your database connections for optimal results based on your application payload
Where to Download?
Get the full source code through GitHub at: https://github.com/VoltDB/voltdb-client-csharp
Or get the latest build with .CHM documentation and Intellisense support .XML documentation from the downloads page: http://learn.voltdb.com/DLSoftwareDownload.html