Summary: The Unit of Work Pattern - parallel, atomic, ordered data processing for associated data objects
Author: Shay Hassidim, Deputy CTO, GigaSpaces
Recently tested with GigaSpaces version: XAP 8.0.0
Last Update: Feb 2011
GigaSpaces Unit of Work (UOW) enables a stand-alone message producer to group messages into a single unit so that those messages are handled in order, similar to a FIFO queue localized within a transaction. This single unit is called a unit of work, and all messages in it must be processed sequentially, in the order they were created within the unit. Other units can be processed in parallel. This approach maximizes system performance and scalability and allows the system to process vast amounts of data while using memory and CPU resources efficiently.
The UOW can be used in financial systems to process trade orders, in healthcare systems to process patient medical data, in transportation systems to process reservations, in airline systems to process flight schedules, in billing systems to process payments, etc.
GigaSpaces FIFO and UOW
While FIFO mode provides ordered object consumption, it does so in a very strict sense: it defines an order between space objects based on the time they were written into the space. FIFO does not take into account consuming associated objects as one atomic operation. UOW allows a polling container to process a group of associated objects in the order they were written, in parallel with other groups being processed. Multiple polling containers handle different groups concurrently, and the items of each group are processed in FIFO order.
When can the GigaSpaces Unit of Work be used?
GigaSpaces UOW can be used in the following cases:
When there are many consumers, each of which should handle a different group (the number of groups may be unlimited), and the items within a group must be processed in order as one atomic operation.
When there are multiple producers, where data from each producer may be associated with different groups (the number of groups may be unlimited), and the items within a group must be processed in order as one atomic operation.
Example use case
Here is a simple scenario that illustrates the Unit of Work usage:
1. Client A starts an Order ID 1 and submits a request to buy $1000 worth of shares of IBM
2. Client A starts an Order ID 2 and submits a request to buy $1000 worth of shares of MSFT
3. Client A resumes Order ID 1 and submits a request to increase the IBM purchase request by $500
4. Client A resumes Order ID 1 and submits a request to cancel the purchase of IBM shares
5. Client A cancels Order ID 2
In the above scenario, requests 1, 3 and 4 (Order ID 1) should be processed in order as one atomic operation. Requests 2 and 5 (Order ID 2) should also be processed in order as one atomic operation, and that group can be processed in parallel with the first.
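The grouping in this scenario can be sketched in plain Java. This is only an illustration of the idea, not GigaSpaces code: the `Request` class, its fields and the `groupByOrder` helper are hypothetical names introduced for the example. It shows how the five requests, arriving interleaved, fall into two units of work that each keep their arrival order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UowGrouping {
    /** Hypothetical request: scenario step number, order ID and a short action label. */
    static final class Request {
        final int seq;
        final int orderId;
        final String action;

        Request(int seq, int orderId, String action) {
            this.seq = seq;
            this.orderId = orderId;
            this.action = action;
        }
    }

    /** The five requests from the scenario, in arrival order. */
    static List<Request> scenario() {
        List<Request> arrivals = new ArrayList<>();
        arrivals.add(new Request(1, 1, "buy IBM $1000"));
        arrivals.add(new Request(2, 2, "buy MSFT $1000"));
        arrivals.add(new Request(3, 1, "increase IBM by $500"));
        arrivals.add(new Request(4, 1, "cancel IBM purchase"));
        arrivals.add(new Request(5, 2, "cancel order"));
        return arrivals;
    }

    /** Groups step numbers by order ID; arrival order is preserved inside each group. */
    static Map<Integer, List<Integer>> groupByOrder(List<Request> arrivals) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (Request r : arrivals) {
            groups.computeIfAbsent(r.orderId, k -> new ArrayList<>()).add(r.seq);
        }
        return groups;
    }

    public static void main(String[] args) {
        // prints {1=[1, 3, 4], 2=[2, 5]}
        System.out.println(groupByOrder(scenario()));
    }
}
```

Each map entry corresponds to one unit of work: its list must be handled sequentially, while different entries are independent of each other.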
How is the GigaSpaces Unit of Work configured?
Multiple polling containers running in the following mode are started:
Using SingleTakeReceiveOperationHandler.
Using one concurrent consumer thread.
Consuming objects in FIFO mode.
Using a template with a different bucketId for each polling container. This ensures that no contention or race conditions occur.
Using a Local Transaction Manager.
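The text above does not say how a bucketId is assigned to an object. One plausible approach, shown here purely as an assumption, is to derive it from the group identity so that all objects of a group land in the same bucket and are therefore seen by exactly one polling container. The `bucketFor` helper and the bucket count of 4 are hypothetical.

```java
class BucketAssignment {
    /**
     * Maps a group identity to a bucket in the range [0, bucketCount).
     * Assumption: hashing the group ID; the actual example may assign buckets differently.
     */
    static int bucketFor(Object groupId, int bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(groupId.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        int bucketCount = 4; // hypothetical value
        for (int orderId = 1; orderId <= 6; orderId++) {
            System.out.println("order " + orderId + " -> bucket " + bucketFor(orderId, bucketCount));
        }
    }
}
```

With such a mapping, several groups may share a bucket, but a single group is never split across buckets, which is what keeps two containers from competing for the same group.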
The polling container SpaceDataEvent implementation flow:
1. A transaction is started and the object at the head of the FIFO chain is taken.
2. To consume the entire group, takeMultiple is called using a template with the group identity set. The objects are retrieved in FIFO order.
3. The group is processed.
4. The transaction is committed.
5. Other groups are processed in parallel by other polling containers.
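The flow above can be simulated with a small in-memory sketch. It does not use the GigaSpaces API: `FifoStore` stands in for the space, a synchronized method stands in for the transaction (rollback is not modeled), and one thread per bucket stands in for each polling container. All class and method names are hypothetical, and the bucketId is assumed to be the group ID modulo the bucket count.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class UowFlowSketch {
    static final class Item {
        final int seq, groupId, bucketId;
        Item(int seq, int groupId, int bucketId) {
            this.seq = seq; this.groupId = groupId; this.bucketId = bucketId;
        }
    }

    /** In-memory stand-in for the space: items are kept in write (FIFO) order. */
    static final class FifoStore {
        private final List<Item> items = new ArrayList<>();

        synchronized void write(Item item) { items.add(item); }

        /** Steps 1-2: find the oldest item of the bucket, then take its whole group in order. */
        synchronized List<Item> takeNextGroup(int bucketId) {
            List<Item> group = new ArrayList<>();
            Item head = null;
            for (Item i : items) {
                if (i.bucketId == bucketId) { head = i; break; }
            }
            if (head == null) return group; // nothing left for this bucket
            for (Iterator<Item> it = items.iterator(); it.hasNext();) {
                Item i = it.next();
                if (i.groupId == head.groupId) { group.add(i); it.remove(); }
            }
            return group;
        }
    }

    /** Runs one worker per bucket and returns, per group, the order in which items were handled. */
    static Map<Integer, List<Integer>> run(List<Item> written, int bucketCount) {
        FifoStore store = new FifoStore();
        for (Item i : written) store.write(i);
        Map<Integer, List<Integer>> processed = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(bucketCount);
        for (int b = 0; b < bucketCount; b++) {
            final int bucketId = b;
            pool.submit(() -> {
                List<Item> group;
                while (!(group = store.takeNextGroup(bucketId)).isEmpty()) {
                    List<Integer> seqs = new ArrayList<>();
                    for (Item i : group) seqs.add(i.seq);       // step 3: process in order
                    processed.put(group.get(0).groupId, seqs);  // step 4: stand-in for commit
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new TreeMap<>(processed);
    }

    /** The five requests of the example use case; bucketId = groupId % 2. */
    static List<Item> scenario() {
        int[][] data = {{1, 1}, {2, 2}, {3, 1}, {4, 1}, {5, 2}};
        List<Item> items = new ArrayList<>();
        for (int[] d : data) items.add(new Item(d[0], d[1], d[1] % 2));
        return items;
    }

    public static void main(String[] args) {
        // prints {1=[1, 3, 4], 2=[2, 5]}
        System.out.println(run(scenario(), 2));
    }
}
```

The two groups are handled by different worker threads, so they may run concurrently, yet the sequence numbers inside each group always come out in write order.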
UOW Example
Running the Example
You can download an Eclipse project with the example source code, run scripts and configuration.
You can run the UOW Data-Grid with the collocated UOWProcessor within your IDE using the following configuration:
Here is a configuration for a UOW Data-Grid with 2 partitions:
Instead of running the UOWProcessor within your IDE, you can deploy it into the Service Grid.
1. Edit setExampleEnv.bat: set the NIC_ADDR variable to your machine IP and the GS_HOME variable to the GigaSpaces root folder.
2. Start the Service-Grid
runAgent.bat
3. Deploy the UOWProcessor PU
deployUOW.bat
This will deploy the UOW Data-Grid with 2 partitions and a backup.
You can run the UOWFeeder within your IDE using the following configuration:
or using the following:
runClient.bat
Example Code and Configuration
The bucket count is configured in the UOW Data-Grid pu.xml using the BucketConfiguration bean.