Summary: Clustering, replication and partitioned replication between physically remote distributed sites.
Overview

Enterprise-wide applications require the sharing of data across geographically remote sites. Several instances of the application need to share data, while the communication links between the sites have relatively low bandwidth and speed. In many cases, for security and bandwidth reasons, only a portion of each site's local data needs to be replicated to the other remote sites. In such cases, data should be filtered according to user business logic before it is replicated. Support for Wide-Area Network (WAN) data distribution is addressed by the following components:
This section illustrates several patterns you can apply to construct a WAN-based cluster.

Replication Over WAN – One Cluster Using Different Groups

Replication over WAN is achieved by using one cluster configuration with multiple groups: two load-balancing groups and one replication group that uses the replication matrix. With this topology, each partition at each site has a replica space. Each partition replicates its data and operations directly to its partner.
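The partner relationship in this topology can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `PartnerTopology`, the site names "A" and "B", and the rule that partition i at one site is paired with partition i at the other are not taken from the GigaSpaces API or from an actual replication matrix.

```java
// Hypothetical model of the one-cluster topology; not a GigaSpaces API.
final class PartnerTopology {
    private final int partitionsPerSite;

    PartnerTopology(int partitionsPerSite) {
        if (partitionsPerSite <= 0) {
            throw new IllegalArgumentException("partitionsPerSite must be positive");
        }
        this.partitionsPerSite = partitionsPerSite;
    }

    // Load-balancing group: choose a partition within a site by hashing the routing key.
    int partitionFor(Object routingKey) {
        return Math.floorMod(routingKey.hashCode(), partitionsPerSite);
    }

    // Replication group (assumed pairing): partition i at one site replicates
    // to partition i at the other site.
    String replicaOf(String site, int partition) {
        if (partition < 0 || partition >= partitionsPerSite) {
            throw new IllegalArgumentException("unknown partition: " + partition);
        }
        String partnerSite = "A".equals(site) ? "B" : "A";
        return partnerSite + "-" + partition;
    }
}
```

In this sketch a write is first routed to a local partition with `partitionFor`, and `replicaOf` names the remote partition that receives the replicated operation.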
Multi-Cluster Architecture over WAN using IWorker

The IWorker implementation is used to synchronize operations conducted at one local site with operations conducted at another local site. In this way, you may have different clusters with a different number of partitions at each site. Every site runs a Gateway IWorker implementation that is notified of any changes occurring in each local site partition. The IWorker registers for notifications once the space hosting it has started. The IWorker receives notifications only if the space is in active mode.
The Gateway Worker delegates each operation conducted at the local site to the remote site. Since the Gateway Worker acts as a regular client when accessing the remote site, every space operation is distributed to the remote site partitions according to the remote site load-balancing policy.
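The delegation described above can be sketched roughly as follows. This is not the IWorker interface: `RemoteSite`, `PartitionedRemoteSite`, and `GatewayWorker` are hypothetical names, and hash-based routing stands in for whatever load-balancing policy the remote site actually uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a clustered proxy to the remote site.
interface RemoteSite {
    void write(String routingKey, String payload);
}

// Hypothetical remote site that spreads writes over its own partitions
// using its own policy (hash-based here, as an assumption).
final class PartitionedRemoteSite implements RemoteSite {
    final List<List<String>> partitions = new ArrayList<>();

    PartitionedRemoteSite(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    @Override
    public void write(String routingKey, String payload) {
        int target = Math.floorMod(routingKey.hashCode(), partitions.size());
        partitions.get(target).add(payload);
    }
}

// Hypothetical gateway worker: forwards local changes only while its
// hosting space is active, and acts as a regular client of the remote site.
final class GatewayWorker {
    private final RemoteSite remote;
    private volatile boolean active;

    GatewayWorker(RemoteSite remote) {
        this.remote = remote;
    }

    void setActive(boolean active) {
        this.active = active;
    }

    // Called for each local change; returns true if the change was forwarded.
    boolean onLocalWrite(String routingKey, String payload) {
        if (!active) {
            return false;
        }
        remote.write(routingKey, payload);
        return true;
    }
}
```

Because the worker only calls `RemoteSite.write`, the local and remote sites can have different partition counts; the remote side decides where each operation lands.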
Multi-Cluster Architecture Over the WAN using Mirror Service

An advanced approach to constructing a WAN-based environment is to use embedded spaces running as part of a mirror. This provides an enhanced level of high availability for the data passed between the sites, as well as better network utilization. With this approach, the mirror writes the data it receives from the local site spaces into an embedded space. The embedded space is part of a different cluster and has a peer at each remote site. In this way, data is replicated asynchronously across the WAN. If preferred, the embedded space can use the persistent schema with the LRU cache policy, which allows the embedded space to evict data from its cache and store an unlimited amount of data without exposing the VM to out-of-memory problems.

Two remote sites, A and B, are shown below using a 2-way cluster mirror gateway. The gateway cluster acts as one replication channel. With this approach, an IWorker implementation is not used and no proxy operations are conducted directly against the remote site; that is, no entity at site A uses site B's clustered proxy to interact directly with spaces at site B over the WAN. Local site operations are distributed to the remote site via a remote agent, which is the gateway cluster. Each gateway cluster node runs within the same JVM as the local site mirror, as an embedded space in the mirror implementation.

Once the data has been replicated from Site A's gateway cluster node to Site B's gateway cluster node, the gateway cluster node at site B fans it out to the relevant Site B cluster spaces via its output replication filter. The output replication filter uses Cluster B's clustered proxy, which provides transparent use of Cluster B's load-balancing and failover policies.
With this approach, there is one replication channel for all site nodes: you can have multiple partitions at site A or B, all using one replication channel. Because a dedicated cluster acts as this replication channel, it can be controlled easily through the asynchronous replication settings. This capability is very important in WAN-based environments, because network utilization is measured and might be expensive.
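To illustrate why a single asynchronous channel helps network utilization, here is a minimal batching sketch. `BatchingReplicationChannel` and its `batchSize` parameter are hypothetical and do not correspond to actual GigaSpaces replication settings; the sketch only shows many local operations being funneled into fewer WAN transmissions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical single replication channel shared by all local partitions.
final class BatchingReplicationChannel {
    private final int batchSize;
    private final Consumer<List<String>> wanSender;
    private final List<String> pending = new ArrayList<>();

    BatchingReplicationChannel(int batchSize, Consumer<List<String>> wanSender) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.wanSender = wanSender;
    }

    // Any partition may enqueue; a batch is sent once enough operations accumulate.
    synchronized void enqueue(String operation) {
        pending.add(operation);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends whatever is pending as one WAN transmission (for example, on a timer).
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        wanSender.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

A larger batch size means fewer, larger transmissions at the cost of higher replication latency; this is the kind of trade-off that tuning a dedicated channel makes possible.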
Resonance Effect

One important issue to deal with when implementing the cluster mirror gateway is the resonance effect. The resonance effect occurs when:
(SiteA --> MirrorA --> GatewayA --> GatewayB --> SiteB --> MirrorB --> GatewayB --> GatewayA --> SiteA, etc.)

To prevent the resonance effect, filter out data originating from a remote site when the mirror writes its data into its embedded gateway cluster node; that is, Entries that originated at site A and were replicated from site A to site B are not replicated back from site B to site A. By default, the Entry's UID includes the originating container and space name, so you can use the Entry's UID to perform the filtering.

Mirror as Service Grid Service

The Service Grid can monitor the life cycle of the mirror and the embedded gateway, and restart them if there is a failure. To accomplish this, you need to deploy the mirror into the Service Grid. Once the mirror is started, it constructs the embedded gateway node as part of its initialized() method. In this case, the Service Grid is unaware of the embedded gateway node.

If the mirror fails, the source space accumulates the data to be sent to the mirror in its redo log. There is no need to provide high availability for the mirror (i.e. a replicated cluster for the mirror service). There is a handshake mechanism between the source space and the mirror. If the mirror implementation's CacheBulk.store() fails due to a mirror JVM failure, the source space resends the last batch to the mirror when the mirror is restarted. Additionally, if the mirror fails, the Service Grid restarts it on another available GSC, which allows the source space(s) to resend the last replication packet to the mirror (and through it to the remote site via the embedded gateway cluster space).

Recovery from Complete Shutdown

If there is a complete shutdown of all spaces, data must be pushed from the relevant data source (database, remote site) into the local site using a helper service or a program that fetches all relevant data and writes it into the restarted site.
One simple way to do this is via a CacheLoader interface implementation, which is activated when the spaces start up (see Read-Through and Write-Through Examples). Data loaded into the space via the CacheLoader is not sent back to the mirror, so the resonance effect does not occur in this case.
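Returning to the resonance effect: the origin-based filtering described in that section can be sketched as follows. The sketch assumes the originating site has already been extracted from the Entry's UID into an `originSite` field; the UID layout is not given here, so the extraction is deliberately left out. `ReplicatedEntry` and `ResonanceFilter` are hypothetical names, not GigaSpaces classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry; originSite would be derived from the Entry UID in practice.
final class ReplicatedEntry {
    final String originSite;
    final String payload;

    ReplicatedEntry(String originSite, String payload) {
        this.originSite = originSite;
        this.payload = payload;
    }
}

// Hypothetical filter applied when the mirror writes into its embedded gateway node.
final class ResonanceFilter {
    private final String localSite;

    ResonanceFilter(String localSite) {
        this.localSite = localSite;
    }

    // Keeps only entries that originated locally; entries that arrived from
    // a remote site are dropped so they are not replicated back to it.
    List<ReplicatedEntry> entriesForGateway(List<ReplicatedEntry> mirrorBatch) {
        List<ReplicatedEntry> outgoing = new ArrayList<>();
        for (ReplicatedEntry entry : mirrorBatch) {
            if (localSite.equals(entry.originSite)) {
                outgoing.add(entry);
            }
        }
        return outgoing;
    }
}
```

With a filter like this at each site, an Entry written at site A reaches site B once, and site B's mirror does not pass it back to site A's gateway.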