Replicated Space Topology

Summary: In a replicated topology, all space instances contain the same data. Synchronization is done through replication, which can be asynchronous or synchronous.

This page is specific to:
GigaSpaces 6.0

Overview

A replicated space is a topology in which all space instances contain the same data (symmetric replication). The space instances are synchronized through replication, which can be asynchronous or synchronous. The replication protocol can be either unicast or multicast, and can be configured separately for each source-target space pair. In general, asynchronous replication is used when you do not want the replication time to affect the application's space access time. With asynchronous replication, the spaces may be temporarily inconsistent, and data may be lost if the source space fails during replication.

Synchronous replication is used when you want to ensure that the receiving replica space has performed the operation before the application gets an acknowledgement from the space it is directly using. Synchronous replication impacts application performance, since every destructive operation needs to be replicated to all target spaces before the client application receives the acknowledgement that the operation is complete.
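The difference between the two modes can be illustrated with a toy in-memory model. This is only a sketch and does not use the GigaSpaces API: the class ReplicationSketch, its methods, and the use of plain maps as "spaces" are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Toy model of a source space replicating writes to target spaces (hypothetical, not GigaSpaces code). */
final class ReplicationSketch {
    private final Map<String, String> source = new HashMap<>();
    private final List<Map<String, String>> targets = new ArrayList<>();
    // Writes accepted by the source but not yet applied to the targets (asynchronous mode only).
    private final Queue<String[]> pending = new ArrayDeque<>();
    private final boolean synchronous;

    ReplicationSketch(int targetCount, boolean synchronous) {
        this.synchronous = synchronous;
        for (int i = 0; i < targetCount; i++) {
            targets.add(new HashMap<>());
        }
    }

    /** Returning from this method stands for the acknowledgement sent to the client. */
    void write(String key, String value) {
        source.put(key, value);
        if (synchronous) {
            // Synchronous: every target applies the write before the client is acknowledged.
            for (Map<String, String> target : targets) {
                target.put(key, value);
            }
        } else {
            // Asynchronous: acknowledge immediately; the targets catch up later.
            pending.add(new String[] {key, value});
        }
    }

    /** Simulates the background replication work that drains the pending writes. */
    void flush() {
        String[] op;
        while ((op = pending.poll()) != null) {
            for (Map<String, String> target : targets) {
                target.put(op[0], op[1]);
            }
        }
    }

    String readFromTarget(int index, String key) {
        return targets.get(index).get(key);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In asynchronous mode, anything still counted by pendingCount() when the source fails would be lost, which is the data-loss window described above.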

For more details on all replication options see the Replication Group Options section.
To deploy a replicated space without using the SpaceFinder or the gsInstance CLI option, use the Service Grid Deployment Wizard.

The Service Grid Deployment Wizard allows you to deploy large space clusters very easily without specifying the member's ID. A clustered space deployed via the Service Grid supports dynamic reallocation, enhanced resiliency, and self-healing capabilities.

For more details, refer to: Using Service Grid to Deploy Replicated Space (deprecated).

Replicated Remote Space

The basic and most common cluster topology is the Replicated space. This topology provides both high-availability and scalability.
The Replicated remote space has the following characteristics:

  • There is no single point of failure with this topology since each object written into the cluster is located in several spaces.
  • You may distribute the load by allowing every application instance to connect to a different cluster node. Data is fully replicated across all space instances, and every destructive operation is replicated to all cluster nodes. To ensure coherent updates, you should use the optimistic locking protocol and transactions. Optimistic locking locks the object(s) only for the duration of the update operation.
  • When using transactions, replication to the target spaces occurs only when commit is called.
  • You may run the space process on the same machine as the application or on another machine.
  • Each space can be persistent or transient, and you can mix persistent and transient spaces within the same cluster.
  • With Replicated Remote cache, the application's access to the cache involves a remote call.
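The optimistic locking idea mentioned in the list above can be sketched with a version number per object. This is a conceptual illustration only: OptimisticLockSketch and its methods are hypothetical names, and the way the product itself tracks versions is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy version-based optimistic locking (hypothetical, not the GigaSpaces API). */
final class OptimisticLockSketch {
    /** A stored value together with the version assigned to it. */
    static final class Versioned {
        final String value;
        final int version;

        Versioned(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> entries = new HashMap<>();

    synchronized void write(String key, String value) {
        entries.put(key, new Versioned(value, 1));
    }

    synchronized Versioned read(String key) {
        return entries.get(key);
    }

    /**
     * Applies the update only if the caller read the latest version.
     * The entry is locked only while this method runs, not between read and update.
     */
    synchronized boolean update(String key, String newValue, int expectedVersion) {
        Versioned current = entries.get(key);
        if (current == null || current.version != expectedVersion) {
            return false; // Someone else updated the entry first; the caller should re-read and retry.
        }
        entries.put(key, new Versioned(newValue, current.version + 1));
        return true;
    }
}
```

A client whose update is rejected re-reads the entry, reapplies its change to the fresh value, and tries again with the new version.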

The following figure demonstrates a topology where each space is running in a dedicated process (not embedded within the application process) where each application is using a different cluster space node.

Read operations are not replicated; they are performed only within the space the client is connected to.
Updating a replicated space requires pushing the new version of the data to all other cluster nodes.

Replicated Embedded Space

The replicated embedded space (also known as the P2P topology) allows the space to run embedded within the application process. It provides most of the characteristics of the replicated remote space topology, with the following exceptions:

  • The space runs within the same process as the application, which increases the application process memory footprint.
  • Application access to its cache instance data does not require any remote calls, which makes it much faster than the remote replicated space topology.

Running Replicated Space

Synchronous Replicated

Setting up replicated space topology with synchronous replication mode requires that the URL used to start the space includes the following:

cluster_schema=sync_replicated
When running a persistent cluster using the primary_backup or partitioned-sync2backup schemas and starting up all cluster nodes, the primary spaces (the spaces that were running as active before the cluster was shut down) must be started first, and the backup spaces second.
In this way, when a backup space is started, it synchronizes with the primary space's latest data (underlying database).
Starting the backup spaces first will lead to data loss and inconsistent data.
In the future we will remove this limitation.

To start 3 space instances in embedded mode (running as part of your application process) replicating their data and operations synchronously, you should have the following setup:

JavaSpaces API:

Member 1
IJSpace space =
(IJSpace)SpaceFinder.find("/./mySpace?cluster_schema=sync_replicated&total_members=3&id=1");
 
Member 2
IJSpace space =
(IJSpace)SpaceFinder.find("/./mySpace?cluster_schema=sync_replicated&total_members=3&id=2");
 
Member 3
IJSpace space =
(IJSpace)SpaceFinder.find("/./mySpace?cluster_schema=sync_replicated&total_members=3&id=3");

Map API:

Member 1
IMap cache =
(IMap)CacheFinder.find("/./mySpace?cluster_schema=sync_replicated&total_members=3&id=1");
 
Member 2
IMap cache =
(IMap)CacheFinder.find("/./mySpace?cluster_schema=sync_replicated&total_members=3&id=2");
 
Member 3
IMap cache =
(IMap)CacheFinder.find("/./mySpace?cluster_schema=sync_replicated&total_members=3&id=3");
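The member URLs above differ only in the id parameter, so they can be generated in a loop. The helper below is a minimal sketch: ReplicatedUrlSketch and memberUrls are hypothetical names, and the only thing taken from the examples above is the URL format itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds the embedded space URL for every member of a replicated cluster (hypothetical helper). */
final class ReplicatedUrlSketch {
    static List<String> memberUrls(String spaceName, String clusterSchema, int totalMembers) {
        if (totalMembers < 1) {
            throw new IllegalArgumentException("totalMembers must be at least 1");
        }
        List<String> urls = new ArrayList<>();
        // Member IDs start at 1, as in the Member 1..3 examples.
        for (int id = 1; id <= totalMembers; id++) {
            urls.add("/./" + spaceName
                    + "?cluster_schema=" + clusterSchema
                    + "&total_members=" + totalMembers
                    + "&id=" + id);
        }
        return urls;
    }
}
```

Each resulting string can be passed to SpaceFinder.find or CacheFinder.find in the process that hosts the corresponding member.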

You can run the space as a separate process using the <GigaSpaces Root>\bin\gsInstance CLI and access it remotely as specified in the Running a Space Instance section.

You should provide the space URL of one of the cluster nodes, or the cluster name, as specified in the Space URL section.

To start 3 space instances in remote mode (running as a stand alone process) replicating their data and operations synchronously, you should have the following setup:

Member 1
gsInstance "/./mySpace?cluster_schema=sync_replicated&total_members=3&id=1"
 
Member 2
gsInstance "/./mySpace?cluster_schema=sync_replicated&total_members=3&id=2"
 
Member 3
gsInstance "/./mySpace?cluster_schema=sync_replicated&total_members=3&id=3"

Asynchronous Replicated

Setting up a replicated space topology with asynchronous replication mode is the same as for synchronous replication, with one difference: the cluster_schema should be async_replicated, i.e.:

cluster_schema=async_replicated
The async_replicated cluster schema does not force any of the spaces to run in passive (backup) mode. All spaces run in active mode and are available for incoming client requests. For the primary-backup model, see the Replicated Primary-Backup topology below.

Replicated Primary-Backup

In primary-backup mode, replication is used to maintain a backup copy of a single primary space. In this configuration, the replication between the primary and the backup is usually synchronous. This ensures that the backup copy holds exactly the same information as the primary at any given point in time. At any given time, only the primary instance is available to the client application.

Multiple Primary Spaces are not Supported with the primary-backup Cluster Schema
The primary-backup cluster schema does not support multiple primary spaces that have multiple backups for each primary. The primary-backup schema supports only one primary space with multiple backup spaces.

The multi level cache example allows you to form such a cluster. For more details, refer to the readme file.

For details on how a network of processors elects a unique processor (a leader), and how to avoid split-brain scenarios, refer to the Active Election and Avoiding Split-Brain Scenarios section.

Running via API

When using the Primary-Backup topology, the primary and the backup should identify themselves at startup using the id and backup_id URL parameters, as described below:

  • When a backup configuration is used, the URL should include total_members={number of primary instances, number of backup instances per primary}, for example total_members=1,1.
  • The primary ID should be 1, and is set using the id parameter.
  • The backup ID should be in the range 1 to the number of backups, and is set using the backup_id parameter.

When running a persistent space using the primary_backup or partitioned-sync2backup schemas, and all cluster nodes have been shut down, make sure you first start the spaces that were running as active spaces, and only then the backup spaces.
This ensures that when a backup space is started, it synchronizes with the primary space's latest data (underlying database), and not vice versa, which may lead to data loss and inconsistent data.
In the future we will remove this limitation.

For a primary space instance with one backup, use the following setup:

JavaSpaces API:

Primary instance
IJSpace space = (IJSpace)SpaceFinder.find(
    "/./mySpace?schema=default&cluster_schema=primary_backup&total_members=1,1&id=1");
 
Backup instance
IJSpace space = (IJSpace)SpaceFinder.find(
    "/./mySpace?schema=default&cluster_schema=primary_backup&total_members=1,1&id=1&backup_id=1");

Map API:

Primary instance
IMap cache = (IMap)CacheFinder.find(
    "/./mySpace?schema=cache&cluster_schema=primary_backup&total_members=1,1&id=1");
 
Backup instance
IMap cache = (IMap)CacheFinder.find(
    "/./mySpace?schema=cache&cluster_schema=primary_backup&total_members=1,1&id=1&backup_id=1");

In the above examples, the space name (mySpace) should be identical across the different nodes.
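The URL rules for the primary and its backups can be captured in a small helper. This is a sketch under assumptions: PrimaryBackupUrlSketch, primaryUrl, and backupUrl are hypothetical names, and the helper only assembles strings in the format shown in the examples above, with one primary (id=1) and a configurable number of backups.

```java
/** Builds primary and backup URLs for the primary_backup schema (hypothetical helper). */
final class PrimaryBackupUrlSketch {
    /** URL of the single primary; total_members is "1,<backups per primary>". */
    static String primaryUrl(String spaceName, String schema, int backups) {
        if (backups < 1) {
            throw new IllegalArgumentException("at least one backup is expected");
        }
        return "/./" + spaceName
                + "?schema=" + schema
                + "&cluster_schema=primary_backup"
                + "&total_members=1," + backups
                + "&id=1";
    }

    /** URL of a backup: the primary URL plus backup_id, which must be in the range 1..backups. */
    static String backupUrl(String spaceName, String schema, int backups, int backupId) {
        if (backupId < 1 || backupId > backups) {
            throw new IllegalArgumentException("backupId must be between 1 and " + backups);
        }
        return primaryUrl(spaceName, schema, backups) + "&backup_id=" + backupId;
    }
}
```

Because the backup URL is derived from the primary URL, the space name, schema, and total_members values cannot drift apart between the two instances.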

The primary space container name that is generated internally is defined by the following pattern:

<space_name>_container<id>

The primary space fully qualified cluster member name is as follows:

<space_name>_container<id>/<space_name>

The backup space container name that is generated internally is defined by the following pattern:

<space_name>_container<id>_<backup id>

The backup space fully qualified cluster member name is as follows:

<space_name>_container<id>_backup_<backup id>/<space_name>

Running via CLI

For a primary space instance with one backup space instance, use the following setup:

JavaSpaces API:

Primary instance
gsInstance "/./mySpace?schema=default&cluster_schema=primary_backup&total_members=1,1&id=1"
 
Backup instance
gsInstance "/./mySpace?schema=default&cluster_schema=primary_backup&total_members=1,1&id=1&backup_id=1"

Map API:

Primary instance
gsInstance "/./mySpace?schema=cache&cluster_schema=primary_backup&total_members=1,1&id=1"
 
Backup instance
gsInstance "/./mySpace?schema=cache&cluster_schema=primary_backup&total_members=1,1&id=1&backup_id=1"

Replicated Space with Local cache (Master-Local)

The replicated space with local cache topology allows you to run the space instances remotely as separate processes, while still maintaining a local cache within the application process for read operations. The local cache is notified of changes via push or pull update mode. This topology boosts read performance, since no remote calls are involved in read operations served from the local cache.

To access the replicated space from a remote client, start a replicated space as described in the Replicated Remote Space section, and use the following in your application to get a proxy to the cluster:

JavaSpaces API:

IJSpace space =
(IJSpace)SpaceFinder.find("jini://*/*/mySpace");

Map API:

IMap cache =
(IMap)CacheFinder.find("jini://*/*/mySpace");

To run a replicated space with a local cache in the client application, start a replicated space as described in the Replicated Remote Space section, and use the following in your client application to get a proxy to the cluster:

JavaSpaces API:

IJSpace space =
(IJSpace)SpaceFinder.find("jini://*/*/mySpace?useLocalCache");

Map API:

IMap cache =
(IMap)CacheFinder.find("jini://*/*/mySpace?useLocalCache");
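The effect of a local cache on read operations can be illustrated with a toy model. This is not GigaSpaces code: LocalCacheSketch, Master, and LocalCache are hypothetical names, and the push behavior shown (the master sends the new value for entries the client already holds) is an assumption about how push update mode behaves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy master-local model: reads are served locally, updates are pushed from the master. */
final class LocalCacheSketch {
    /** Stand-in for the remote replicated space. */
    static final class Master {
        private final Map<String, String> data = new HashMap<>();
        private final List<LocalCache> listeners = new ArrayList<>();
        int remoteReads = 0; // Counts reads that had to go to the master.

        void write(String key, String value) {
            data.put(key, value);
            // Assumed push mode: notify every registered local cache of the change.
            for (LocalCache cache : listeners) {
                cache.onUpdate(key, value);
            }
        }

        String read(String key) {
            remoteReads++;
            return data.get(key);
        }
    }

    /** Stand-in for the cache running inside the client application process. */
    static final class LocalCache {
        private final Master master;
        private final Map<String, String> local = new HashMap<>();

        LocalCache(Master master) {
            this.master = master;
            master.listeners.add(this);
        }

        String read(String key) {
            String value = local.get(key);
            if (value == null) {
                // Cache miss: one remote call, then keep the value locally.
                value = master.read(key);
                if (value != null) {
                    local.put(key, value);
                }
            }
            return value;
        }

        void onUpdate(String key, String value) {
            // Refresh only entries this client has already cached.
            if (local.containsKey(key)) {
                local.put(key, value);
            }
        }
    }
}
```

After the first read of a key, repeated reads and reads following a pushed update are served without increasing remoteReads.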

Replicated Ownership Space

To ensure data coherency for update operations, you may use the ownership topology. With this topology, all spaces replicate all their data and operations to the other spaces, but each space is responsible for performing updates on a different segment of the data. It is not possible to update two copies of the same object located in different spaces at the same time: updates are always routed to the space that owns the entry. Read operations can be performed on any of the cluster members.
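The routing idea can be sketched with a toy model in which every key has exactly one owning member. This is only an illustration: OwnershipSketch is a hypothetical class, and the ownership rule used here (hash of the key modulo the number of members) is an assumption, since the way the ownership-replicated schema assigns owners is not described on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy ownership model: updates go through one owner, reads work on any member. */
final class OwnershipSketch {
    private final List<Map<String, String>> members = new ArrayList<>();

    OwnershipSketch(int totalMembers) {
        for (int i = 0; i < totalMembers; i++) {
            members.add(new HashMap<>());
        }
    }

    /** Assumed ownership rule: hash of the key modulo the member count; member IDs start at 1. */
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), members.size()) + 1;
    }

    /** Routes the update to the owning member, then replicates it to all members. */
    int update(String key, String value) {
        int owner = ownerOf(key);
        members.get(owner - 1).put(key, value); // The owner applies the update first.
        for (Map<String, String> member : members) {
            member.put(key, value); // Every member ends up with the same data.
        }
        return owner;
    }

    /** Reads can be served by any member, since all data is replicated. */
    String read(int memberId, String key) {
        return members.get(memberId - 1).get(key);
    }
}
```

Because a given key always maps to the same owner, two concurrent updates of the same object are serialized by that single member instead of being applied to two copies independently.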

Running Ownership Space

To set up an ownership model space, you should use the ownership-replicated cluster schema as part of the space URL that starts the space, together with the total cluster members and space instance member ID.

To start an ownership space with 3 members, where each node runs as a standalone process, you should have the following setup:

JavaSpaces API:

Member 1
gsInstance "/./mySpace?schema=default&cluster_schema=ownership-replicated&total_members=3&id=1"
Member 2
gsInstance "/./mySpace?schema=default&cluster_schema=ownership-replicated&total_members=3&id=2"
Member 3
gsInstance "/./mySpace?schema=default&cluster_schema=ownership-replicated&total_members=3&id=3"

Map API:

Member 1
gsInstance "/./mySpace?schema=cache&cluster_schema=ownership-replicated&total_members=3&id=1"
Member 2
gsInstance "/./mySpace?schema=cache&cluster_schema=ownership-replicated&total_members=3&id=2"
Member 3
gsInstance "/./mySpace?schema=cache&cluster_schema=ownership-replicated&total_members=3&id=3"

To access the clustered space from the application, include the following in your code:

JavaSpaces API:

IJSpace space = (IJSpace)SpaceFinder.find("jini://*/*/mySpace");

Map API:

IMap cache = (IMap)CacheFinder.find("jini://*/*/mySpace");

