NetShock Blog
Friday, May 22, 2009

Distributed transactions and SOA

A much-debated issue is that of distributed transactions with SOA.  One of the major challenges with SOA is data consistency and integrity.  Obviously, having clearly defined service boundaries rules out foreign key constraints across services and disrupts conventional database design.  So how do we make sure our data integrity remains intact when executing complex, cross-service operations?

In traditional development, any developer would tell you that the best way to sustain data integrity for complex operations is to use transactions.  When all operations are wrapped in a transaction and any one of them fails, the entire transaction can be rolled back.  A developer need not worry about half-finished orchestrations or manual rollbacks.
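To make the single-database case concrete, here is a minimal sketch using Python's built-in sqlite3 module.  The accounts table and the transfer, make_db and balances helpers are made up for illustration; the point is only that a failure part-way through leaves no half-applied change behind.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    # Used as a context manager, the connection commits if the block
    # finishes normally and rolls back if an exception escapes it.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if row is None or row[0] < 0:
            # The debit above has already run; the rollback undoes it.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

Calling transfer(conn, "a", "b", 500) against these balances raises ValueError, and both accounts are left exactly as they were before the call.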

With SOA, however, we want our services to be loosely coupled.  A distributed transaction spanning multiple services is considered to break service boundaries and tightly couple those services.  Purists of SOA will strongly oppose the idea of distributed transactions for this very reason - communication between services should be restricted to structured messages and nothing else.

Of course, that doesn't solve our original problem of data integrity.

I spoke with one of the senior developers on Microsoft's Azure team about this issue, given that Azure, at the time, had no distributed transaction functionality.  Microsoft are pro-orchestration for service communication: they would prefer to create a workflow (or orchestration) describing how services should be called to complete an operation, as opposed to having services communicate directly with each other (by means of events).  His suggestion was to build failure-handling logic into the orchestration to manually roll back the data changes.
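A rough sketch of what that manual rollback looks like in practice is below.  None of this is Azure's API or the approach Microsoft described in any detail; the Orchestration class and the three in-memory services are hypothetical stand-ins.  Each step is paired with a compensating action, and when a step fails the completed steps are undone in reverse order.

```python
class Orchestration:
    """Runs steps in order; on failure, compensates completed steps in reverse."""

    def __init__(self):
        self._steps = []  # list of (action, compensate) pairs

    def add_step(self, action, compensate):
        self._steps.append((action, compensate))

    def run(self):
        completed = []
        try:
            for action, compensate in self._steps:
                action()
                completed.append(compensate)
        except Exception:
            # Manual rollback: undo what already succeeded, newest first.
            for compensate in reversed(completed):
                compensate()
            raise


# Hypothetical in-memory services standing in for real remote ones.
class OrderService:
    def __init__(self):
        self.orders = {}

    def create(self, order_id, sku):
        self.orders[order_id] = sku

    def cancel(self, order_id):
        self.orders.pop(order_id, None)


class InventoryService:
    def __init__(self, stock):
        self.stock = dict(stock)

    def reserve(self, sku):
        if self.stock.get(sku, 0) < 1:
            raise RuntimeError("out of stock")
        self.stock[sku] -= 1

    def release(self, sku):
        self.stock[sku] += 1


class PaymentService:
    def __init__(self, fail=False):
        self.fail = fail
        self.charged = []

    def charge(self, order_id):
        if self.fail:
            raise RuntimeError("payment declined")
        self.charged.append(order_id)

    def refund(self, order_id):
        self.charged.remove(order_id)


def place_order(orders, inventory, payments, order_id, sku):
    """One cross-service operation with hand-written rollback for each step."""
    flow = Orchestration()
    flow.add_step(lambda: orders.create(order_id, sku),
                  lambda: orders.cancel(order_id))
    flow.add_step(lambda: inventory.reserve(sku),
                  lambda: inventory.release(sku))
    flow.add_step(lambda: payments.charge(order_id),
                  lambda: payments.refund(order_id))
    flow.run()
```

Even in this toy version, every step needs its own undo operation, and a real system would also have to cope with a compensation that itself fails.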

I have a couple of issues with this approach.  The first is fairly obvious: the development time needed to write manual rollback logic for every cross-service operation would be significant.  Secondly, it still leaves the door open for read operations to occur halfway through the orchestration, when only half the services have been updated.  A transaction, on the other hand, would lock these resources until the entire operation had completed.
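The second problem is easy to demonstrate with a toy example.  In this sketch (AccountService and the between_steps hook are invented for illustration, with the hook standing in for a concurrent reader), a transfer is orchestrated as two separate service calls with no transaction around them.

```python
class AccountService:
    """Hypothetical service that owns a single balance."""

    def __init__(self, balance):
        self.balance = balance


def orchestrated_transfer(src, dst, amount, between_steps=None):
    """Two independent service calls with nothing locking the data between them."""
    src.balance -= amount      # step 1: the first service is updated
    if between_steps is not None:
        between_steps()        # a reader arrives mid-orchestration
    dst.balance += amount      # step 2: the second service is updated


savings = AccountService(100)
checking = AccountService(0)
observed_totals = []

orchestrated_transfer(
    savings,
    checking,
    40,
    between_steps=lambda: observed_totals.append(savings.balance + checking.balance),
)
```

The reader in the middle sees a combined total of 60 rather than 100: the money has left one service and not yet arrived at the other.  Once the orchestration completes the total is 100 again, but anything that read in between acted on inconsistent data.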

Juval Lowy (http://www.idesign.net), a well-respected WCF guru who runs training courses for WCF (among other things), also commented on this issue.  His view was that distributed transactions are significantly easier for a developer to use, and that they should be used wherever applicable.  He acknowledges the boundaries this may break; however, he asserts that the alternative is really not practical.  Of course, he also acknowledges there are times when distributed transactions should not be used.

If your services are separated by significant geographic distance or by boundaries (e.g. domains), then distributed transactions may not physically be an option, or may simply not be the best option.  In this case, there are a number of other options from which you can choose.

If you want a purely SOA architecture, distributed transactions may not sit well with you, but if you're looking to build a solid architecture for a real business, distributed transactions often seem like the only practical option.

At NetShock, we've developed some fairly simple best-practice guidelines for distributed transactions.  They factor in a number of variables to decide which approach will work best for the business, not for an SOA purist.
