WOW!
Unless you’ve been having internet connection problems (or you are not tuned in to Microsoft blogs) you have probably heard about the new C# “await” keyword. The WOW factor here is that it allows LEGACY CODE to be re-written asynchronously. Here’s a short example:
Blocking code (legacy code):
int foo() { return webservice.read() + 1; }
Non-blocking code (C# 5):
async Task<int> foo() { return await webservice.readAsync() + 1; }
That’s it. You can re-write your entire legacy C# application in less than a day into a Node.js-like single-threaded thingy. All of the state-machine boilerplate code is generated by the C# compiler. While the web service network operation is in progress, the thread can run other code elsewhere. When the web service returns and the thread is not occupied, the “increment by 1” is performed and the result is returned to the caller of foo(). No context-switching overhead, no clumsy callbacks, no idle CPU cycles… simple!
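For contrast, here is roughly what you would have to write by hand without “await”, using a task continuation (a simplified sketch that assumes webservice.readAsync() returns a Task<int>; the real compiler-generated state machine also deals with exceptions, the synchronization context and so on):

// Without await: the "+ 1" continuation has to be wired up manually as a
// callback that runs once the web service call completes.
Task<int> foo()
{
    return webservice.readAsync()
                     .ContinueWith(t => t.Result + 1);
}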
For those of you who want a deeper dive, read all about it on Eric Lippert’s blog.
Oh no! Idempotent
My second gut reaction was to think about fail-over. GigaSpaces users are used to saving their state objects in the Space. This ensures that the continuation will actually continue if the process dies unexpectedly. How? The state is replicated to the backup on another machine, and if the primary fails unexpectedly, the GigaSpaces backup picks up from where the primary left off. So let’s fantasize that we have a .NET application using the “await” keyword, and that it stores the continuation state (meaning which line of code to execute when the web service responds) in the Space. GigaSpaces replicates the continuation state to the backup after starting the web service request (but before the request completes). Even if the state replication to the backup is done on the same thread, it is going to return quickly, as the replication SLA is tightly controlled by GigaSpaces. If you don’t want to pay this small price, GigaSpaces can replicate to the backup asynchronously, with reduced consistency guarantees.
Now what would happen if the primary fails and a fail-over occurs after we called the web service but before it returned the result? The backup will pick up where the primary left off and will call the web service again. That means the web service will be called twice! If the web service only reads data then it is idempotent (meaning you can call the web service once, twice or N times and the result would be the same). But the web service may have a side effect (such as starting a new virtual machine). That means our application will start two machines instead of one.
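Here is a sketch of that failure window (the ISpaceProxy, IVmService and ProvisionState names below are illustrative stand-ins, not a specific API):

using System.Threading.Tasks;

// Illustrative types only: a "space" that replicates writes to the backup,
// and a remote web service with a side effect (starting a VM).
interface ISpaceProxy { void Write(object state); }
interface IVmService  { Task<string> StartVmAsync(); }

enum Step { CallingWebService, Done }
class ProvisionState { public Step Step; public string VmId; }

class Provisioner
{
    private readonly ISpaceProxy space;
    private readonly IVmService webservice;

    public Provisioner(ISpaceProxy space, IVmService webservice)
    {
        this.space = space;
        this.webservice = webservice;
    }

    public async Task ProvisionAsync(ProvisionState state)
    {
        state.Step = Step.CallingWebService;
        space.Write(state);              // continuation state reaches the backup

        // If the primary dies after the call below starts but before the next
        // Write(), the backup resumes at Step.CallingWebService and calls
        // StartVmAsync() again, so two virtual machines get started unless
        // the call is idempotent.
        state.VmId = await webservice.StartVmAsync();

        state.Step = Step.Done;
        space.Write(state);
    }
}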
What I’m trying to get at is that in order for this async flow continuation to work, all remote web service calls need to be idempotent. The Amazon EC2 web service took 3 years to get it right.
Transactions!!
So you are designing a scalable business-logic application and you plan to store your state in SQL Server (or MySQL). That requires a remote call, which calls for async operations. An async operation requires the SQL statements to be idempotent, since a fail-over in the continuation may invoke them twice. Idempotent SQL means transactions (an atomic read and then write). Transactions don’t scale well, unless you use stored procedures.
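For example, here is a minimal sketch of what an idempotent SQL write can look like from C# (the Processed/Accounts tables and the request-id scheme are assumptions for illustration). The read (“was this request id already applied?”) and the write happen atomically in one transaction, so replaying the statement after a fail-over has no additional effect:

using System.Data.SqlClient;

// Minimal sketch: an atomic read-then-write keyed by a request id, so that
// invoking it a second time after a fail-over changes nothing.
void Debit(string connectionString, string requestId, int accountId, decimal amount)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            var cmd = new SqlCommand(
                @"IF NOT EXISTS (SELECT 1 FROM Processed WHERE RequestId = @id)
                  BEGIN
                      UPDATE Accounts SET Balance = Balance - @amount WHERE AccountId = @acc;
                      INSERT INTO Processed (RequestId) VALUES (@id);
                  END", conn, tx);
            cmd.Parameters.AddWithValue("@id", requestId);
            cmd.Parameters.AddWithValue("@amount", amount);
            cmd.Parameters.AddWithValue("@acc", accountId);
            cmd.ExecuteNonQuery();
            tx.Commit();
        }
    }
}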
Instead of running the business logic on one machine and storing the business data on another, stored procedures are hosted inside the database. Let’s say the transaction performs 2 read operations and 2 write operations. With remote data access, we need to perform 4 async network I/O operations. Stored procedures, on the other hand, perform all 4 operations in memory, and only when the transaction is committed does a disk I/O operation occur.
But writing all of the business-logic code as stored procedures is not very maintainable (the Complex Event Processing guys claim that Java/C# code is also not maintainable, but I will gracefully ignore this :). And there is the database performance issue at scale. Without SSD caching, committing a transaction requires moving the disk head, which creates a contention point between different threads and is limited by the hard drive’s throughput. Scaling transactions requires data partitioning, and you end up designing your own data grid.
An in-memory data grid lets developers write the business logic in Java or C# and still run it in the same process as the data. The data grid also runs all 4 operations in memory and performs 1 network I/O operation to the backup machine (which, unlike writing to disk, is not a thread contention point). The resulting data-access code is completely synchronous, and since it is Java/C# based it is easy to read. The transaction is just a method attribute, and the network operation occurs after the method ends.
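Roughly, the co-located code ends up looking like ordinary synchronous C#. The attribute, proxy and entity types in the sketch below are illustrative stand-ins rather than a specific product API:

using System;

// Illustrative stand-ins for the grid's declarative transaction and data-access API.
class TransactionalAttribute : Attribute { }

interface ISpaceProxy
{
    T Read<T>(object id);
    void Write(object entry);
}

class Account { public int Id; public decimal Balance; }
class Order   { public int Id; public decimal Amount; public bool Paid; }

class OrderProcessor
{
    private readonly ISpaceProxy space;   // co-located: reads and writes hit local memory

    public OrderProcessor(ISpaceProxy space) { this.space = space; }

    [Transactional]   // the transaction is "just a method attribute"
    public void Process(int accountId, int orderId)
    {
        // 2 reads and 2 writes, all synchronous and in-process.
        var account = space.Read<Account>(accountId);
        var order = space.Read<Order>(orderId);

        account.Balance -= order.Amount;
        order.Paid = true;

        space.Write(account);
        space.Write(order);
        // When the method ends and the transaction commits, a single network
        // operation replicates the change to the backup partition.
    }
}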
To conclude, when it comes to data services, try to avoid the async keyword. I believe the right choice is to run the business logic co-located with the business data and use synchronous data APIs. When you do have to perform an async web service API call, make sure it’s idempotent; otherwise you run the risk of executing it twice.
Itai