VoltDB.NET: Synchronous vs. Asynchronous Request Processing

written by Seb Coursol on March 21, 2011

Deciding whether to handle your workload as a sequential set of synchronous operations or as an asynchronous batch isn’t just a .NET topic: it is a larger design decision in any VoltDB application. This week we review the pros and cons of each choice in a general context, and then as it pertains specifically to your .NET implementations.

By now, you probably have a good idea of how VoltDB works: each partition is its own processing engine, and while work within a partition is performed sequentially, each partition is essentially autonomous, so all partitions perform work in parallel.

As we discuss below, per-partition throughput can easily translate to 100,000 TPS on modern server hardware, but we will keep the math simple in our following description.

Suppose your average transaction runs for, say, 2ms: each partition then has a throughput of 500 transactions per second, and a system with 2 partitions has a throughput of 1,000 transactions per second. Things get slightly more convoluted when you introduce redundancy (K-Safety), but essentially the math remains the same: your throughput increases linearly with the number of partitions.

In addition to the execution time of the transaction itself, you, as a client application, must also account for the “transit time”: the time it takes for your request to reach the server, then for the response to come back to you, and the overhead of serializing and de-serializing each. For the sake of the example, let’s say that this overhead accounts for an additional 3ms.

So now, if you run transactions one after another, each operation takes approximately 5ms, and your throughput drops to around 200 transactions per second: a performance hardly anyone will be impressed with. Now what if, instead of waiting for the completion of your first request before sending a second one, you just keep pushing requests one after the other for the execution engine to process, and collect your responses later when the engine returns them to you? Even on a single partition, you will see some performance improvement. The math is simple: it takes 1.5ms for a request to reach the server, so your second request arrives at the execution engine 0.5ms before the first request completes, and starts executing before the server even starts sending you the response to that first request. If you keep pushing requests fast enough, there is always a queue of pending executions ready for the server to process, which means the engine performs at maximum capacity: 500 transactions per second. Similarly, because that type of volume is hardly going to cause network strain, you will also receive responses at a rate of 500 per second.
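The arithmetic above can be captured in a few lines. This is just the illustrative math from this example (the timing figures are the hypothetical numbers used in this post, not measurements):

```csharp
// Illustrative throughput math, using the example figures (all times in ms).
double executionTime = 2.0;   // server-side transaction execution
double transitTime   = 3.0;   // round-trip network + serialization overhead

// Synchronous: each call occupies the client for execution + transit.
double syncThroughput = 1000.0 / (executionTime + transitTime);    // 200 TPS

// Asynchronous: with a full pipeline, the engine is the only bottleneck.
double asyncPerPartition = 1000.0 / executionTime;                 // 500 TPS

// Throughput scales linearly with partitions when the load is balanced.
int partitions = 2;
double asyncThroughput = asyncPerPartition * partitions;           // 1,000 TPS

System.Console.WriteLine($"sync: {syncThroughput} TPS, async: {asyncThroughput} TPS");
```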

The real difference appears when you have more than one partition in your VoltDB database: synchronous operations get absolutely no benefit from the engine’s parallelism because you still push only one request at a time, and while one partition is busy executing that request, the other sits idle: a waste of resources!

With asynchronous operations, things change entirely: with a balanced load that pushes an equal amount of work to each partition, you find yourself tapping into an engine that gives you twice the capacity (assuming a 2-partition database). Suppose the first request goes to partition 1; 0.5ms before it completes, the second request, bound for partition 2, arrives and immediately starts executing. 1ms after the first request completes, a third request arrives and starts executing immediately in partition 1, which was idle; just as request 2 completes, request 4 arrives, is bound to partition 2, and therefore starts executing immediately, and so on. As long as you push requests fast enough, both partitions work at full capacity, giving you twice the throughput.

The following 2 diagrams show the difference: on the left, the synchronous model, where only one partition at a time is active and each new request is pushed upon completion (and reception) of the previous one; on the right, the asynchronous model, where requests are pushed in parallel to the server as fast as possible, to be executed there in parallel (with the responses processed, later, in parallel as well).

VoltDB Synchronous vs Asynchronous Execution

From those diagrams, you can also understand why you want to optimize your workload for partitioned work, where you benefit from the engine’s by-design parallelism. Multi-partition transactions, while sometimes necessary, will quickly choke your system: even with an asynchronous implementation, a 2ms multi-partition transaction limits your throughput to 500 such transactions per second; you are essentially reducing your VoltDB database to a single-partition engine!

If we get back to the real world for a moment, the bias is even stronger, because per-partition throughput far exceeds 500 TPS: on modern hardware you would easily see a 5-figure number there. The bulk of the time is wasted in network latency, so this decoupling of operations becomes key to your application design and can mean a night-and-day difference in your performance results.

Asynchronous or Synchronous – How Do You Choose?

The asynchronous model is often “advertised” as beneficial through the following pseudo-code snippet:

Start Long Asynchronous Operation

Do Something Else
. . .
Do Something Else

Wait For Long Asynchronous Operation
Get Result From Long Asynchronous Operation
Process Result

And the justification is that, instead of waiting idly for that long operation to complete, you can do other useful work. In many practical cases, however, you find yourself implementing:

Start Long Asynchronous Operation

Do nothing!

Wait For Long Asynchronous Operation
Get Result From Long Asynchronous Operation
Process Result

. . . simply because you don’t have much, or anything, to do while you wait, and merely need that response to provide feedback to the user! Think, for instance, of a login operation, or any type of transaction for which you absolutely need the response from the server before you can (or truly need to) continue: this is most often the case when working on websites or mobile applications, unless you have designed a smart user interface that can operate asynchronously.

With VoltDB however, there is another strong advantage in using asynchronous operations: you get to squeeze more performance from the server!

The choice of going asynchronous therefore truly depends on how strongly “coupled” your payload operations are: ideally, you can “fire” requests and don’t need to provide client feedback at all (or at least not under any highly sensitive time constraint). You will be in that type of situation when processing data streams, or any type of transaction where a response delay of, say, half a second doesn’t bother you and, most importantly, where the execution of one request does not depend on the result of a prior request. For instance, think of a system that analyzes a Twitter stream to perform keyword analysis: your program parses the tweets, aggregates keyword counts for each tweet, and posts a series of transactions to the server to increase the total keyword counts. Each request is entirely independent from the others and can be performed asynchronously; your only “heavy” and synchronous transaction will be an occasional multi-partition aggregate to return, say, the top 100 keywords to a user.

A login operation isn’t as lucky: unfortunately, you do need the result of the login transaction before you can accept (or reject) the user’s login details and move forward. However, nothing says you cannot have multiple threads pushing those login requests to the same database connection (or to multiple database connections). So while your login operations are performed synchronously, all hope isn’t lost: you can still leverage VoltDB’s massive internal parallelism. A session data store is another such example: you do need synchronous access to the session data of a single user; however, viewed from afar, when multiple users interact with your website, things work essentially in parallel and still benefit from the full performance of the engine. In our next study, we will re-implement the traditional session state data store on VoltDB.
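As a minimal sketch of that idea, assuming a hypothetical single-partition “Login” procedure (the procedure name, signature, and the AcceptLogin/RejectLogin helpers are made up for illustration; the wrapper and Execute calls are the same ones shown throughout this series, and connection setup is elided as in the other snippets):

```csharp
// Hypothetical example: many threads performing synchronous logins
// against a single shared, thread-safe VoltConnection (setup elided).
var Login = conn.Procedures.Wrap<SingleRowTable, string, string>("Login");

// Each thread blocks on its own synchronous call, but because the
// connection is shared and thread-safe, the requests still reach the
// partitions in parallel and tap the engine's full capacity.
Parallel.ForEach(pendingLogins, credentials =>
{
    var response = Login.Execute(credentials.UserName, credentials.PasswordHash);
    if (response.Status == ResponseStatus.Success)
        AcceptLogin(credentials);   // hypothetical application callback
    else
        RejectLogin(credentials);   // hypothetical application callback
});
```

Parallel.ForEach (from System.Threading.Tasks in .NET 4) is just one convenient way to fan the work out; a pool of dedicated worker threads achieves the same effect.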

Making Synchronous Calls

Synchronous calls are best demonstrated in our “Hello World” sample application, which we reviewed in our first post. After creating your procedure wrapper, you simply call the .Execute method to push the request to the server and wait for the response, which is returned directly from the method call:

var Select = conn.Procedures.Wrap<SingleRowTable, string>("Select");
var response = Select.Execute("English");

Obviously, because of their linear/“causal” model, synchronous calls are very easy to understand and work with: you want something, you request it, and you get your answer right there and then.

But again, they come with strong limitations in terms of performance.

Making Asynchronous Calls

Some good examples of asynchronous workloads are provided in our Voter and KeyValue samples. Here, we will cover the basics.

Respecting the .NET asynchronous design pattern, the VoltDB.NET library comes with the complete array of asynchronous operations around the IAsyncResult object. Your most basic asynchronous operation would resemble the following:

var Select = conn.Procedures.Wrap<SingleRowTable, string>("Select");
IAsyncResult handle = Select.BeginExecute("English");

// Do something

var response = Select.EndExecute(handle);

If we look back at the “Hello World” example, you will see we perform all our insert operations sequentially:

// Define the procedure wrappers we will use

var Insert = conn.Procedures.Wrap<Null,string,string,string>("Insert");
var Select = conn.Procedures.Wrap<SingleRowTable, string>("Select");

// Initialize the database.

if (!Select.Execute("English").Result.HasData)
{
    Insert.Execute("Hello", "World", "English");
    Insert.Execute("Bonjour", "Monde", "French");
    Insert.Execute("Hola", "Mundo", "Spanish");
    Insert.Execute("Hej", "Verden", "Danish");
    Insert.Execute("Ciao", "Mondo", "Italian");
}

Now, technically, while all those operations depend on the pre-validation execution of Select (ensuring we do not re-run the initialization, causing primary key violations), the inserts themselves aren’t related to one another, so we could execute them all in parallel, asynchronously. This is one way it could be accomplished:

string[][] data = new string[][] {
    new string[] { "Hello", "World", "English" }
  , new string[] { "Bonjour", "Monde", "French" }
  , new string[] { "Hola", "Mundo", "Spanish" }
  , new string[] { "Hej", "Verden", "Danish" }
  , new string[] { "Ciao", "Mondo", "Italian" }
};
WaitHandle[] handles = new WaitHandle[5];
for (int i = 0; i < 5; i++)
{
    handles[i] = Insert.BeginExecute(data[i][0], data[i][1], data[i][2]).AsyncWaitHandle;
}
WaitHandle.WaitAll(handles);

Since we don’t care about the server response, we merely need to wait for all 5 requests to complete. If you haven’t used the IAsyncResult object before, we strongly recommend you read the .NET documentation about it: multiple design patterns are available to wait for the completion of your asynchronous requests (and among other things, while WaitHandles are easy to work with, they do come with some serious limitations and performance costs). Here we will focus on practical, common applications when using VoltDB.
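One of those limitations is worth calling out: WaitHandle.WaitAll accepts at most 64 handles per call (and cannot be called from an STA thread, such as a default WinForms UI thread). If you queue more requests than that, a simple workaround is to wait in batches. The helper below is a generic sketch, independent of VoltDB:

```csharp
using System;
using System.Threading;

static class WaitHelper
{
    // Wait on an arbitrary number of handles by batching the WaitAll calls,
    // since WaitHandle.WaitAll throws if given more than 64 handles.
    public static void WaitAllBatched(WaitHandle[] handles, int batchSize = 64)
    {
        for (int i = 0; i < handles.Length; i += batchSize)
        {
            int count = Math.Min(batchSize, handles.Length - i);
            WaitHandle[] batch = new WaitHandle[count];
            Array.Copy(handles, i, batch, 0, count);
            WaitHandle.WaitAll(batch);
        }
    }
}
```

Waiting batch by batch is fine here because we need all requests to finish before continuing; the overall wait ends when the slowest request completes, regardless of batching order.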

In the more general case, however, you will likely want to do something with the server response. This is when a callback method comes in handy. For instance, in the Voter sample, we perform the following operations:

static void VoterCallback(Response<int> response)
{
    if (response.Status == ResponseStatus.Success)
        Interlocked.Increment(ref VoteResults[response.Result]);
    else
        Interlocked.Increment(ref VoteResults[3]);
}

static void Main(string[] args)
{
    var connection = . . .
    . . .

    var Vote = connection.Procedures.Wrap<int, long, sbyte, int>("Vote", VoterCallback);

    . . .

    Vote.BeginExecute(phoneNumber, contestantId, maximumAllowedVotesPerPhoneNumber);

    . . .

    connection.Drain();

    . . .
}

The callback is provided as part of the procedure wrapper definition. After that, you only need to call BeginExecute on the wrapper, and your callback will automatically be invoked when the response is received from the server. Note that using a callback and calling .EndExecute are mutually exclusive, and that calls to .Execute will not trigger the callback.

In this specific case, and because we post thousands of transactions to the server, we use a “global” wait operation on the connection itself by “draining” it of all work before continuing (pulling up the results and closing the connection). This is much easier (and faster) than using WaitHandles, but there is a catch: you must own that connection entirely. Should other threads share the connection and submit work to it, you would effectively wait for all threads to have no more pending work; this is hardly ever what you want, since on an active multi-threaded system you could wait forever!

If you share your database connection with multiple other threads (for instance, within the context of a web service backing a website or mobile application), you will want to use one of the IAsyncResult wait patterns, or a “trick” specific to VoltDB. Suppose you have a 2-partition database on a single server, to which you post a balanced payload of 1,000 requests: half for partition 1, half for partition 2. While you are not guaranteed the execution order across partitions (that is, a request A sent to partition 1 followed by a request B sent to partition 2 is not guaranteed to execute in that order: if partition 1 is loaded, request B might execute first), you are more or less guaranteed a certain chain of events. Most importantly, because a multi-partition transaction needs to gain access to the entire cluster, you know that if, after posting your 1,000 single-partition transactions, you post a single multi-partition transaction, that transaction will execute last, only after all the single-partition transactions have completed.

So one way to wait for the completion of your operations is to run a “dummy” multi-partition transaction after you have posted all your single-partition workload. This is a trick, however, and there is one catch: if you have a multi-node connection, transaction ordering becomes more convoluted: essentially, whichever transaction arrives first on any of the nodes will execute first. So, conceivably, if some of your nodes are lagging on the network, you could be sending that multi-partition transaction before some of the single-partition transactions have been received. If you are not extremely sensitive about timing, one possible workaround is to make your main thread sleep for half a second or more before posting the multi-partition transaction, which should ensure it arrives last. In a later version of the library, we intend to provide a reliable mechanism that allows you to wait on thread-specific payload completion without resorting to such tricks or to the more cumbersome wait handles.
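As a sketch, the trick looks like this. The Vote wrapper and callback come from the Voter sample above; the “SelectAll” procedure name and the votes collection are hypothetical stand-ins for any cheap multi-partition procedure in your catalog and for your pending workload:

```csharp
// Post the single-partition workload asynchronously (connection setup elided).
var Vote = conn.Procedures.Wrap<int, long, sbyte, int>("Vote", VoterCallback);
foreach (var vote in votes)
    Vote.BeginExecute(vote.PhoneNumber, vote.ContestantId, maxVotesPerPhone);

// On a multi-node cluster, a short pause helps ensure the barrier
// transaction reaches every node after all the single-partition work.
Thread.Sleep(500);

// Then run one cheap multi-partition transaction synchronously: because it
// needs the whole cluster, it executes only after the pending
// single-partition transactions have completed.
var Barrier = conn.Procedures.Wrap<SingleRowTable>("SelectAll"); // hypothetical procedure
Barrier.Execute();

// At this point, all the votes posted above have been executed.
```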

Words of Advice

The strength of VoltDB is the combination of its lightning-speed in-memory processing with its massive by-design parallelism: a simple multi-core modern workstation will have no trouble screaming through the Voter benchmark application at rates nearing 100,000 transactions per second. But to benefit from that type of performance, you will have to do your best to leverage asynchronous operations, which means non-coupled requests. When you have an absolute need for request sequencing, look to multi-threaded implementations: all operations on the VoltConnection are thread-safe, so there is nothing wrong with having 60 threads pushing requests into a globally shared connection (or a pool of shared connections). This is the type of implementation you will want when using VoltDB to back a website or mobile application, as it gives you the benefit of simple synchronous operations within each execution thread, with the performance advantage inherent to the engine core.

What’s Next?

Nothing beats a good example for a practical, real-world application, and one of the most basic needs of a modern website is session management and caching support. ASP.NET session state comes in various flavors: an in-memory data store adequate for single-server deployments but unfortunately ill-suited to web farms; a separate out-of-process data store that can be hosted on a separate server; and a SQL Server implementation, that last one providing a lot of advantages but dreary performance. More recently, Microsoft came up with its distributed caching solution, AppFabric, also deployed on its Azure cloud platform, but for many of us out there, Memcached implementations have some appeal. In my recent experience, Memcached as a session store worked quite well; however, as I expanded my implementation into a more generic caching layer, I ran into all sorts of synchronization issues between my multiple servers, coupled with a lack of thread-safety in some of the core methods of the provider I had found. I had just invested in a clumsy out-of-process in-memory hashtable I couldn’t easily scale as time went on! Those are all problems a VoltDB solution will not have, since clustering is part of the core design, leaving the door wide open for a very attractive implementation of a scalable caching layer and session store.

I have already started work on this, but it will likely be a couple of weeks before the full solution is available for download. For those of you interested in the challenge, you might want to read up on the ASP.NET 2.0 Providers (don’t let the version number or the dates fool you: nothing new has happened in that area since then).

Where to Download?

Get the full source code through SVN at:
http://svnmirror.voltdb.com/clientapi/csharp/.
Or get the latest build, with .CHM documentation and .XML documentation for IntelliSense support, from the downloads page:
http://community.voltdb.com//downloads

Seb Coursol
Sr. Technical Consultant
VoltDB