OpenSpaces Overview

Summary: The theory behind OpenSpaces and Space-Based Architecture (SBA); Developing high throughput EDA/SOA applications using OpenSpaces and SBA; OpenSpaces architecture; SLA-driven containers.

This page is specific to:
GigaSpaces 6.5

Overview

OpenSpaces is a development framework designed to enable scaling-out of stateful applications in a simple way using Spring. It is an open source initiative from GigaSpaces and supports the Space-Based Architecture model out-of-the-box. OpenSpaces is useful for Spring users, service-oriented and event-driven architectures (SOA/EDA), transactional applications, real-time analytics, and Web 2.0 applications.

For a list of FAQs, including questions around the licensing model and positioning, click here.

The Theory Behind OpenSpaces and Space-Based Architecture

The model and theory that led to OpenSpaces, a next-generation development and runtime platform for achieving scalability in high-throughput, stateful environments, are described in detail in the following white paper: The Scalability Revolution: From Dead End to Open Road.

In this paper, we define scalability, and show that inherent scalability barriers represent a dead end for today's tier-based business-critical applications. We argue that in order to survive, these applications must achieve linear scalability, and that the only way to do so is to switch from the tier-based model to a new architectural approach. We suggest a novel approach in which applications are partitioned into self-sufficient Processing Units, and present Space-Based Architecture (SBA) as a practical implementation of this approach. We demonstrate that SBA guarantees both linear scalability and simplicity for designers, developers and administrators - transforming scalability from dead end to open road.

The Space-Based Architecture and the End of Tier-based Computing white paper describes how changes in the IT resource landscape, such as memory capacity, network speed, the emergence of powerful multi-core commodity hardware, and the introduction of SOA/Grid architectures, promise truly linearly scalable systems at a lower cost. It introduces the Space-Based Architecture (SBA) approach as a means of transforming existing tier-based applications into linearly and dynamically scalable services.

OpenSpaces was built to be an implementation of the theory behind these concepts and make the development of applications based on this model as simple as Spring.

Developing High-Throughput EDA/SOA Applications using OpenSpaces and Space-Based Architecture

The simplest way to understand how OpenSpaces uses Space-Based Architecture to enable high-throughput EDA/SOA is through an example. We use a trading application, more specifically an Order Management System (OMS), because it is a classic case of an application with highly demanding scalability and latency requirements in a stateful environment. (Note: This description focuses on the unique elements of SBA at a very high level. The details of building a trading application are outside the scope of this section.)

A trading application usually consists of a data feed - or trade requests - that flows into the system in some sort of financial standard format (e.g., FIX). These feeds need to be matched, with very low latency, against other trades that exist in the market. The business logic typically includes the following steps:

  1. Parsing and validation (transforming FIX format into a binary object and validating that it conforms to certain rules).
  2. Matching (which queries the data store to find a matching trade, and executes the deal).
  3. Routing (which routes details of the deal to interested parties).

The application needs to provide a 100% guarantee that once a transaction enters the system it will not be lost. It also needs to keep end-to-end latency (latency from the time the system receives a trade to the time the business process ends) to a fraction of a millisecond - and ensure this low latency is not affected by future scaling.

The first step in building such an application with SBA is to define its business logic components as independent services - Enrichment Service (parsing and validation), Order Book Service (matching and execution), Reconciliation Service (routing):


To reduce the latency overhead of communication between these services, they are all collocated in a single Virtual Machine (VM). To eliminate the network overhead of communication with the messaging and data tiers, Messaging Grid and Data Grid instances are both collocated in the same VM. All the interaction with all the services is done purely in-process, bringing I/O overhead to a minimum in both the data and messaging layers.

This collocated unit of work (which includes business logic, messaging and data) is called a Processing Unit. Because the Processing Unit encompasses all application tiers, it represents the application's full latency path. And because everything occurs in-process, latency is reduced to an absolute minimum.
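As a rough illustration of this collocation, the sketch below chains the three services as plain in-process method calls. It uses only the Java standard library and is not the OpenSpaces API: the class `OrderPipelineSketch`, the `Order` type, the method names, and the simplified `SYMBOL,SIDE,QTY` message format (standing in for FIX) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, stdlib-only model of a Processing Unit: three services
// collocated in one JVM and chained by plain method calls.
class OrderPipelineSketch {

    // Simplified order; a real system would parse a FIX message instead.
    static final class Order {
        final String symbol;
        final boolean buy;
        final int quantity;

        Order(String symbol, boolean buy, int quantity) {
            this.symbol = symbol;
            this.buy = buy;
            this.quantity = quantity;
        }
    }

    private final List<Order> orderBook = new ArrayList<>();    // stands in for the collocated data grid
    private final List<String> routedDeals = new ArrayList<>(); // stands in for outbound routing

    // Enrichment Service: parse "SYMBOL,SIDE,QTY" and validate it.
    Order enrich(String raw) {
        String[] parts = raw.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed order: " + raw);
        }
        int quantity = Integer.parseInt(parts[2].trim());
        if (quantity <= 0) {
            throw new IllegalArgumentException("Quantity must be positive: " + raw);
        }
        return new Order(parts[0].trim(), "BUY".equalsIgnoreCase(parts[1].trim()), quantity);
    }

    // Order Book Service: find an opposite-side order with the same symbol
    // and quantity; otherwise keep the incoming order in the book.
    Order match(Order incoming) {
        Iterator<Order> it = orderBook.iterator();
        while (it.hasNext()) {
            Order resting = it.next();
            if (resting.symbol.equals(incoming.symbol)
                    && resting.buy != incoming.buy
                    && resting.quantity == incoming.quantity) {
                it.remove();
                return resting;
            }
        }
        orderBook.add(incoming);
        return null;
    }

    // Reconciliation Service: record the executed deal for interested parties.
    void route(Order incoming, Order matched) {
        routedDeals.add(incoming.symbol + " x" + incoming.quantity
                + " against " + (matched.buy ? "BUY" : "SELL"));
    }

    // The whole business transaction runs in-process, with no network hop.
    boolean process(String raw) {
        Order order = enrich(raw);
        Order counterpart = match(order);
        if (counterpart != null) {
            route(order, counterpart);
        }
        return counterpart != null;
    }

    List<String> routedDeals() {
        return routedDeals;
    }
}
```

Calling `process("ACME,BUY,100")` followed by `process("ACME,SELL,100")` leaves the first order resting in the book and then matches it against the second, all within one JVM.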


Scaling is achieved simply by adding more Processing Units and spreading the load among them. Scaling does not affect latency, because the application's complexity does not increase. Each transaction is still routed to a single Processing Unit, which handles the entire business transaction in-process, with the same minimal level of latency.
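One common way to spread load so that each transaction still lands on a single Processing Unit is to hash a routing key and take it modulo the number of units. The sketch below assumes that scheme for illustration only; the source does not state which routing function OpenSpaces uses, and `PartitionRouterSketch` is a hypothetical name.

```java
// Hypothetical sketch of key-based routing across Processing Units.
// The hash-modulo scheme is an assumption, not the documented algorithm.
class PartitionRouterSketch {

    private final int partitions;

    PartitionRouterSketch(int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        this.partitions = partitions;
    }

    // Map a routing key to a partition index in the range [0, partitions).
    // floorMod keeps the result non-negative even for negative hash codes.
    int partitionFor(Object routingKey) {
        return Math.floorMod(routingKey.hashCode(), partitions);
    }
}
```

Because the same key always maps to the same index, every request for a given key is handled by one unit, and that unit can process the whole transaction in-process.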

We can see that the trading application guarantees both minimal latency and linear scalability - something that would be impossible with a tier-based, best-of-breed approach (in other words, with separate products to manage business logic, data and messaging).

OpenSpaces Architecture

The following diagram outlines a typical architecture of an application built with OpenSpaces:

Processing Unit

At the heart of the application is the processing-unit. A processing-unit represents the unit of scale and failover of an application. It is built as a self-sufficient unit that can contain all the components required to process a user's transaction. This includes the messaging component, which routes transactions between processing units and provides a means of communication between services collocated within the processing unit itself; the business logic units, which are essentially POJOs that process events delivered from the messaging component; and the data component, which holds the state required by the business logic.

The processing-unit is built as an extension of the Spring application context, so developing a processing unit looks just like developing any Spring application context. In addition to the standard Spring framework, it provides components designed primarily to enable rapid development of SOA/EDA-based applications. These components are explained below.

Declarative Event Containers

There are two main types of event containers: Polling and Notify containers. Event containers are used to abstract the event processing from the event source. This abstraction enables users to build their business logic with minimal binding to the underlying event source, whether it is a Space-based event source, a JMS event source, etc.

The "wiring" between the POJO service and the event handler is done declaratively through Spring configuration:

<os-events:notify-container id="eventContainer" giga-space="gigaSpace">

    <os-events:notify write="true" update="true"/>

    <os-core:template>
        <bean class="org.openspaces.example.data.common.Data">
            <property name="processed" value="false"/>
        </bean>
    </os-core:template>

    <os-events:listener>
        <os-events:annotation-adapter>
            <os-events:delegate ref="simpleListener"/>
        </os-events:annotation-adapter>
    </os-events:listener>
</os-events:notify-container>

The POJO service is where users write their business logic. It is very similar to a Message-Driven Bean from the J2EE framework, or to message-driven POJOs in Spring. The code snippet below is an example of what a POJO service looks like. It uses an annotation (@SpaceDataEvent) to mark the method that is triggered by a specific event.

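import org.openspaces.events.adapter.SpaceDataEvent;
import org.openspaces.example.data.common.Data;

public class DataProcessor {

    // Invoked by the event container for each Data object that matches its template.
    @SpaceDataEvent
    public Data processData(Data data) {
        // Mark the object as processed so it no longer matches the template.
        data.setProcessed(true);
        return data;
    }
}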
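To make the idea of annotation-driven delivery more concrete, the following stdlib-only sketch shows one way a container could locate and invoke an annotated method through reflection. It is an assumption-laden model, not the OpenSpaces implementation: `@DataEvent` is a hypothetical stand-in for @SpaceDataEvent, and the nested `Data` and `DataProcessor` classes are simplified copies so that the example compiles on its own.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical, stdlib-only model of annotation-based event delivery.
class AnnotationDispatchSketch {

    // Stand-in for @SpaceDataEvent, defined here so the sketch is self-contained.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DataEvent {}

    static class Data {
        private boolean processed;

        boolean isProcessed() { return processed; }

        void setProcessed(boolean processed) { this.processed = processed; }
    }

    static class DataProcessor {
        @DataEvent
        public Data processData(Data data) {
            data.setProcessed(true);
            return data;
        }
    }

    // Find the first method marked @DataEvent and invoke it with the event.
    static Object dispatch(Object listener, Object event) {
        for (Method method : listener.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(DataEvent.class)) {
                try {
                    method.setAccessible(true);
                    return method.invoke(listener, event);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Event delivery failed", e);
                }
            }
        }
        throw new IllegalStateException("No @DataEvent method on " + listener.getClass());
    }

    // Mimics the template in the configuration: deliver only unprocessed objects.
    static Object deliverIfUnprocessed(Object listener, Data event) {
        return event.isProcessed() ? null : dispatch(listener, event);
    }
}
```

The listener class itself never refers to the event source, which is the decoupling the event containers aim for.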

GigaSpace - Core Middleware Component

The GigaSpace component is a POJO-driven abstraction of the JavaSpaces specification. JavaSpaces is a service specification that provides a distributed object exchange and coordination mechanism (which might or might not be persistent) for Java objects. It can be used to store the system state and implement distributed algorithms. In a space, all communication partners (peers) communicate by sharing state. It is an implementation of the tuple space idea.

JavaSpaces is used when someone wants to achieve scalability and availability, while reducing the complexity of the overall system. Processes perform simple operations to write new objects into a space, take objects from a space, or read (make a copy of) objects from a space.
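The write, read and take operations can be modeled in a few lines. The sketch below is a single-JVM, in-memory approximation that assumes template matching by example, where a null field in the template acts as a wildcard. It is not the JavaSpaces API: there are no leases, transactions or blocking calls, `read` returns the stored reference instead of a copy, and `TinySpaceSketch` and `Trade` are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory model of write/read/take with template matching.
class TinySpaceSketch {

    // Example entry with public, object-typed fields.
    static class Trade {
        public String symbol;
        public Integer quantity;

        public Trade(String symbol, Integer quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }
    }

    private final List<Object> entries = new ArrayList<>();

    // write: add a new object to the space.
    synchronized void write(Object entry) {
        entries.add(entry);
    }

    // read: return a matching object and leave it in the space.
    synchronized Object read(Object template) {
        return find(template, false);
    }

    // take: return a matching object and remove it from the space.
    synchronized Object take(Object template) {
        return find(template, true);
    }

    synchronized int size() {
        return entries.size();
    }

    private Object find(Object template, boolean remove) {
        Iterator<Object> it = entries.iterator();
        while (it.hasNext()) {
            Object candidate = it.next();
            if (matches(template, candidate)) {
                if (remove) {
                    it.remove();
                }
                return candidate;
            }
        }
        return null; // no match; a real space could block and wait instead
    }

    // A null template field is a wildcard; a non-null field must be equal.
    private static boolean matches(Object template, Object candidate) {
        if (!template.getClass().isInstance(candidate)) {
            return false;
        }
        try {
            for (Field field : template.getClass().getFields()) {
                Object wanted = field.get(template);
                if (wanted != null && !wanted.equals(field.get(candidate))) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that callers must cast the returned `Object`, and that the entry exposes public fields; both points are addressed by the GigaSpace abstraction described next.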

The goal behind the GigaSpace abstraction is to provide a simpler interface that fits into a POJO-driven architecture such as Spring through the following principles:

  • POJO Entries - the data model in JavaSpaces is an Entry. An Entry has to implement a specific interface (Entry), and its attributes are public, non-transient Java objects. This model was written before the POJO model became common through JEE frameworks such as JPA and Hibernate, and is quite different from it. The POJO data model is basically a Java Bean representation with annotations that extend the model with specific meta-information such as index definitions, the persistency model, etc. The new GigaSpace model uses POJO-driven Entries; defines annotations for indexes, the persistency model, replication semantics, etc.; and follows the same logic and semantics that are used today in the JEE world. This makes the integration of Space-Based Architecture with JEE a more native fit. Note that Entries can still be used with the GigaSpace interface.
  • Declarative transactions - the JavaSpaces API uses explicit transaction semantics, in which the transaction is passed as an argument to each method. While this model provides a finer level of granularity, it exposes more complexity to the developer. The transaction and locking semantics provided in the specification are also limited. Spring uses a declarative transaction model, in which transactions are implicit. Users can use annotations or XML to define specific transaction/locking semantics.
  • Generics support - users can use generics to avoid unnecessary casting and make their interaction with the space more type-safe.
  • Overloaded methods - the GigaSpace interface uses overloaded methods that apply defaults to reduce the number of arguments passed in read/take/write methods.
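The last two principles can be illustrated without the real interface. The sketch below is a hypothetical, stdlib-only facade: the generic `take` methods return the requested type without a cast, and the shorter overload fills in a default timeout. The class name, method signatures, default value and per-class queues are assumptions for the example and do not reproduce the GigaSpace interface.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical facade showing generics and overloaded defaults.
class TypedSpaceSketch {

    // Assumed default for the sketch: do not wait when nothing matches.
    static final long DEFAULT_TIMEOUT_MILLIS = 0L;

    // One queue per exact class; a simplification of real template matching.
    private final Map<Class<?>, BlockingQueue<Object>> queues = new ConcurrentHashMap<>();

    private BlockingQueue<Object> queueFor(Class<?> type) {
        return queues.computeIfAbsent(type, key -> new LinkedBlockingQueue<>());
    }

    void write(Object entry) {
        queueFor(entry.getClass()).add(entry);
    }

    // Overload with a default: callers that accept the default pass one argument.
    <T> T take(Class<T> type) {
        return take(type, DEFAULT_TIMEOUT_MILLIS);
    }

    // Generic return type: the caller receives a T, so no cast is needed.
    <T> T take(Class<T> type, long timeoutMillis) {
        try {
            Object entry = queueFor(type).poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return type.cast(entry); // cast(null) returns null when nothing arrived
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

With this shape, `String s = space.take(String.class);` compiles without a cast, and the two-argument form is only needed when the default does not fit.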

Using GigaSpace Component in the Context of EDA/SOA Applications

The space serves several purposes in an EDA/SOA application:

  • Messaging Grid - in this case, the space is used as a distributed transport that enables remote and local services to send and receive objects based on their content. In a typical Space-Based Architecture, the space is used to route requests/orders from the data source to the processing-unit, based on a predefined affinity-key. The affinity-key is used to route the request/order to the appropriate processing-unit. Since it is optimized to run in-memory, it is also used as a means to enable the workflow between the embedded POJO services.
  • In-Memory Data Grid - in this case, the space is used as a distributed object repository that provides in-memory access to distributed data. Data can be distributed in various topologies; partitioned and replicated are the main ones. In a typical Space-Based Architecture, the space instances are collocated within each processing-unit and therefore provide local access to distributed data required by POJO services running under that processing-unit. The domain model is also POJO-driven. Data objects are basically Java Beans with annotations, which add specific metadata required by the Data Grid to mark indexed fields, the affinity-key, and whether the object should be persisted or not, as can be seen in the code snippet below:
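    import com.gigaspaces.annotation.pojo.SpaceClass;
    import com.gigaspaces.annotation.pojo.SpaceId;
    import com.gigaspaces.annotation.pojo.SpaceRouting;

    @SpaceClass
    public class Data {

        private Long id;
        private Long type;

        // Unique identifier of the object in the space.
        @SpaceId
        public Long getId() {
            return this.id;
        }

        // Affinity-key used to route the object to a processing-unit.
        @SpaceRouting
        public Long getType() {
            return this.type;
        }
    }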
  • Processing Grid - a processing-grid represents a particular and common use of the space for parallel transaction processing using a master/worker pattern. In a Space-Based Architecture, the processing-grid is implemented through a set of POJO services that serve as the workers, and event containers that move events between the space and these services. Requests/orders are processed in parallel across the different processing-units, as well as within each processing-unit when a pool of services handles the events.
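The master/worker pattern mentioned in the last item can be sketched with standard Java concurrency utilities. In this hypothetical model, two `BlockingQueue` instances stand in for the space, a fixed thread pool stands in for the pool of POJO services, and squaring a number stands in for processing a request. None of these names come from OpenSpaces.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical master/worker model: queues stand in for the space.
class MasterWorkerSketch {

    static List<Integer> squareAll(List<Integer> requests, int workers) {
        // Master: write all requests into the shared "space".
        BlockingQueue<Integer> requestSpace = new LinkedBlockingQueue<>(requests);
        BlockingQueue<Integer> resultSpace = new LinkedBlockingQueue<>();

        // Workers: each one takes requests until none are left and writes results back.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                Integer request;
                while ((request = requestSpace.poll()) != null) {
                    resultSpace.add(request * request);
                }
            });
        }

        // Master: wait for the workers, then collect the results.
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<Integer> results = new ArrayList<>(resultSpace);
        Collections.sort(results); // completion order is not deterministic
        return results;
    }
}
```

Because each request is taken by exactly one worker, adding workers increases throughput without changing the processing logic.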

Space Based Remoting

Space Based Remoting allows for POJO services that are collocated within a specific processing unit to be exposed to remote clients, like any other RMI service. Spring provides a generic framework for exposing and invoking POJO-based services. OpenSpaces utilizes the Spring remoting framework to enable POJO services to expose themselves through the space, as illustrated in the diagram below:


The client uses the SpaceRemotingProxyFactoryBean to create a space-based dynamic proxy for the service. The client uses the proxy to invoke methods on the appropriate service instance. The proxy captures the invocation and creates a generic command Entry with the information on the service-instance, the method-name, and arguments; and calls the space write operation to send the command to the service implementation, followed by a blocking take for the response.

A service that needs to be exported uses the SpaceRemotingServiceExporter to export itself. The SpaceRemotingServiceExporter creates a service-delegator listener that registers for invocation commands by calling the take method on the space. The command contains information about the instance that needs to be invoked, the method and the arguments. The delegator uses this information to invoke the appropriate method on the POJO service. If the method returns a value, it captures the value and uses the space write method to write a response Entry.
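The exchange described above (write a command, take it on the service side, write a response, take it on the client side) can be modeled with the standard library alone. The following sketch is not SpaceRemotingProxyFactoryBean or SpaceRemotingServiceExporter: it uses two in-memory queues in place of the space, a `java.lang.reflect.Proxy` in place of the client proxy, and a daemon thread in place of the service delegator. `SpaceRemotingSketch`, `Command` and `Calculator` are hypothetical names, and the single response queue assumes one caller at a time.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, single-JVM model of command/response remoting over a space.
class SpaceRemotingSketch {

    // Generic command entry: which method to invoke and with which arguments.
    static final class Command {
        final String methodName;
        final Class<?>[] parameterTypes;
        final Object[] args;

        Command(String methodName, Class<?>[] parameterTypes, Object[] args) {
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.args = args;
        }
    }

    // Example service contract used by the sketch.
    interface Calculator {
        int add(int a, int b);
    }

    private final BlockingQueue<Command> commands = new LinkedBlockingQueue<>();
    private final BlockingQueue<Optional<Object>> responses = new LinkedBlockingQueue<>();

    // Service side: take commands, invoke the POJO, write a response.
    <T> void export(Class<T> serviceInterface, T service) {
        Thread delegator = new Thread(() -> {
            try {
                while (true) {
                    Command command = commands.take();
                    Method method = serviceInterface.getMethod(command.methodName, command.parameterTypes);
                    Object result = method.invoke(service, command.args);
                    responses.put(Optional.ofNullable(result));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Invocation failed", e);
            }
        });
        delegator.setDaemon(true);
        delegator.start();
    }

    // Client side: a dynamic proxy that writes a command and blocks on the response.
    @SuppressWarnings("unchecked")
    <T> T proxy(Class<T> serviceInterface) {
        return (T) Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface},
                (proxy, method, args) -> {
                    commands.put(new Command(method.getName(), method.getParameterTypes(), args));
                    return responses.take().orElse(null);
                });
    }
}
```

The client only ever holds the proxy and the queues, never a direct reference to the service, which is what allows the real mechanism to redirect commands when a service moves or fails.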

Benefits (Compared to RMI):

  • Efficiency - unlike RMI, space-based remoting leverages the fact that the space is the network gateway, and therefore doesn't require any additional sockets or I/O resources beyond the ones that have already been allocated to the space.
  • Scalability - the client stub can point to a cluster of processing units, each containing different instances of the same service for scalability. The proxy utilizes the space clustered proxy for load-balancing of the requests between processing units.
  • Continuous high availability - since the client proxy doesn't point directly to a specific server but to a space proxy, it remains valid during failover or relocation of a service. If a service fails, the command is automatically routed to the backup processing-unit. The POJO service contained in this unit immediately picks up the request and responds instead of the failed service, so the request continues smoothly through the failure.
  • Loosely coupled - a single proxy can point to a set of service instances. This provides the flexibility to invoke a method on a single service regardless of its physical location, or to perform broadcast operations, i.e. invoke multiple services at the same time.
  • Synchronous/Asynchronous invocation - a client can choose to invoke a method and wait for the result (synchronous invocation), or to invoke a method and pick up the result at a later stage (asynchronous invocation).

SLA-Driven Container

OpenSpaces lets you deploy a processing unit over a dynamic pool of machines through SLA-driven containers, formerly known as Grid Service Containers (GSCs). The SLA-driven containers are Java processes that provide a hosting environment for a running processing unit. The Grid Service Manager (GSM) manages the deployment of the processing-unit based on the SLA. The SLA definition is part of the processing-unit configuration file, which is normally named pu.xml. It defines the number of PU instances that need to be running at a given point in time, the scaling policy, and the failover policy, based on CPU, memory, or application-specific measurements.


The following snippet is taken from the example SLA definition section of the processing unit Spring configuration:

<os-sla:sla cluster-schema="partitioned-sync2backup" number-of-instances="2" number-of-backups="1"
            max-instances-per-vm="1">
    <os-sla:monitors>
        <os-sla:bean-property-monitor name="Processed Data"
                                      bean-ref="dataProcessedCounter"
                                      property-name="processedDataCount" />
    </os-sla:monitors>
</os-sla:sla>
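The monitor in this snippet reads a property named processedDataCount from a bean with the id dataProcessedCounter. The source does not show that bean, so the class below is only one plausible shape for it: the class name `DataProcessedCounterSketch` and the `dataProcessed()` method are assumptions, and only the property name comes from the configuration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical bean exposing an application-specific measurement.
// Only the property name "processedDataCount" is taken from the SLA snippet.
class DataProcessedCounterSketch {

    private final AtomicLong processedDataCount = new AtomicLong();

    // Called by the business logic after each object is processed.
    public void dataProcessed() {
        processedDataCount.incrementAndGet();
    }

    // JavaBean getter for the "processedDataCount" property.
    public long getProcessedDataCount() {
        return processedDataCount.get();
    }
}
```

An atomic counter is used because several event-handling threads may update the measurement concurrently.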

