Read-Through and Write-Through Overview

This page is specific to:
GigaSpaces 5.x

Overview

This section discusses space write and read-through operations and how to configure them by implementing the CacheLoader/Store Interface or by using a standard Hibernate interface.
Read and write-through operations are discussed for two main client API users: the JavaSpaces API using Entry objects and the Map API using POJO objects. Each type of write and read is demonstrated, including single and multiple objects, transactions, and read/write in partitioned and replicated clustered spaces. Diagrams and examples are included.
The integration of the Hibernate interface into the read and write-through operations is discussed, and the Mirror space, which allows a space to "write-behind" asynchronously to a data source, is described.
The configuration and settings of the CacheLoader and CacheStore Interface are described, including how to configure XML-based ORM mappings.

Enterprise Data Fabric

CacheLoader/Store Interface - Middleware Link to Persistent Data Sources

The write and read-through operations and the CacheLoader/Store Interface described in this chapter are an important part of the GigaSpaces concept of virtualizing and subsuming all the data sources required by an enterprise under one unified management, forming the Enterprise Data Fabric. The system can connect a running client application to such disparate data sources as local data caches, local and remote databases, clustered memory, message stores, and even remote running applications, all in a way that is transparent to the client application. The CacheLoader/Store Interface is the key middleware connection link for loading and storing data to and from persistent data sources.

Virtualization of Tiers - Space-Based Architecture

Similar to the virtualization of data described above, GigaSpaces offers a space-based architecture (SBA) for processes, in which tiers are implemented as services within a shared runtime environment – regardless of the number of processing units actually involved – rather than as discrete tiers of presentation, business logic, and data access executing in autonomous runtimes.
Virtualization of the tiers (decoupling the physical entity from the logical entity) is a key strategy of space-based computing for achieving dynamic scalability.
In an SBA, virtualization of the tiers is achieved by grouping together the tiers required to process the application logic into a single logical processing unit. Scaling is achieved by running multiple instances of those units on multiple machines. In that way the tiers are also spread across the machines and therefore become a virtual entity: no specific server hosts a specific tier; rather, every machine hosts every tier. It is the data and the load that are partitioned between the units, and this partitioning enables scaling out. Overall efficiency also improves, since the tiers interact within the same VM and incur no serialization/de-serialization overhead.

Persisting Space Data into Permanent Storage

There are many situations in which the data in a space needs to be persisted to permanent storage in a database or other external source and retrieved from it, for example:

  • A client process works primarily with the memory space for temporary storage of process data structures and the permanent storage is used to extend or back up the physical memory of the process running the space. In case the data in the space becomes unavailable, for example, due to cache eviction, the backup data in permanent storage can be accessed.
  • A client process works primarily with the database storage and the space is used to make read processing more efficient. Since database access is expensive, the data read from the database is cached in the space where it is available for subsequent fast read operations.
  • When a space is restarted, data from its persistent media files can be loaded into the space cache to speed up incoming query processing.

Bridging the Gap Between Object and Relational

Object-oriented development dominates in the enterprise and most client applications today are written in the Java, C#, and C++ languages. However, the majority of business-critical data is stored in relational database management systems (RDBMS) or similar systems that use record-based (non object-oriented) storage whose data is read by query-based search schemes.
Because of this mismatch, an intermediate object-relational mapping (ORM) step is required to translate objects to records when writing data to the database, and records to objects when reading data from it. This intermediate step is implemented in middleware that is detached from and transparent to the client application. Client calls to standard API read and write methods trigger the middleware functionality without the client needing to intervene. Advanced middleware systems permit the client API to formulate and pass a database query for use when reading from the database.
The Hibernate system, an ORM persistence and query service for the Java language, can provide this service for RDBMS. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL. However, the Hibernate system, which is discussed in Write-Read Through using Hibernate Cache Loader/Store, is restricted to the Hibernate API on the client level and does not itself relate to read/write-through caching.

ORM Mapping Files

The object-relational mappings are formalized in XML mapping files used to configure the system. In Hibernate, these files have the extension .hbm.xml, for example:

<hibernate-mapping>

    <class name="events.Person" table="PERSON">
        <id name="id" column="PERSON_ID">
            <generator class="native"/>
        </id>
        <property name="age"/>
        <property name="firstname"/>
        <property name="lastname"/>
    </class>

</hibernate-mapping>

Alternatively, ORM mappings can be embedded in the code as annotations that are recognized and extracted by the Hibernate processor.
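
To make the mapping concrete, the following is a minimal sketch of a class that the hbm.xml example above could describe, together with a hand-written version of the object-to-record translation that the mapping expresses. The field types (Long for id, int for age) and the PersonRowMapper helper are assumptions made for illustration; the helper is not Hibernate code, and the package declaration (events) is omitted so that the file compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical POJO for <class name="events.Person" table="PERSON">.
// Field types are assumed; the mapping file does not state them.
class Person {
    private Long id;          // <id name="id" column="PERSON_ID">
    private int age;          // <property name="age"/>
    private String firstname; // <property name="firstname"/>
    private String lastname;  // <property name="lastname"/>

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
    public String getFirstname() { return firstname; }
    public void setFirstname(String firstname) { this.firstname = firstname; }
    public String getLastname() { return lastname; }
    public void setLastname(String lastname) { this.lastname = lastname; }
}

// Hand-written illustration of object <-> record translation (not Hibernate).
// Properties without a column attribute are assumed to use the property name.
class PersonRowMapper {
    static Map<String, Object> toRow(Person p) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("PERSON_ID", p.getId());
        row.put("age", p.getAge());
        row.put("firstname", p.getFirstname());
        row.put("lastname", p.getLastname());
        return row;
    }

    static Person fromRow(Map<String, Object> row) {
        Person p = new Person();
        p.setId((Long) row.get("PERSON_ID"));
        p.setAge((Integer) row.get("age"));
        p.setFirstname((String) row.get("firstname"));
        p.setLastname((String) row.get("lastname"));
        return p;
    }
}

class PersonMappingDemo {
    public static void main(String[] args) {
        Person p = new Person();
        p.setId(1L);
        p.setAge(30);
        p.setFirstname("John");
        p.setLastname("Smith");
        // Print the record form of the object.
        System.out.println(PersonRowMapper.toRow(p));
    }
}
```

With a mapping file or annotations in place, an ORM tool performs this translation itself, so code like PersonRowMapper does not have to be written by hand.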

CacheLoader/Store Interface

The GigaSpaces CacheLoader/Store interface handles reading and writing space data to and from a data source when the space determines that such operations are necessary.
The CacheLoader/Store interface:

  • Persists the space data into an RDBMS using your own database model.
  • Interfaces seamlessly with both JavaSpace API and IMap API and is automatically triggered by API read/write operations. It has a unified design for interchangeable use with JavaSpace API and Map API.
  • Allows you to implement ORM translation.
  • Works with a wide variety of data sources.
  • Can be implemented as a Hibernate CacheLoader/Store driver to utilize the Hibernate database access mechanism.
  • Supports batch operations.
  • Supports transactional and non-transactional operations.
  • Supports query delegation to underlying data source.
  • Supports query optimization.
  • Supports read-ahead.
  • Can be used with any cluster topology, replicated or partitioned. Supports shared database and non-shared database mode.

CacheLoader/Store Interface Operation

The basic idea of the CacheLoader/Store interface is to provide user-implementable store and load methods that can be programmed with the ORM required for different types of data sources or, alternatively, can be configured as standard modules that do not require user implementation, such as the Hibernate CacheLoader/Store driver.
The following diagram shows the operation of the CacheLoader/Store Interface schematically.

In general terms, the CacheLoader/Store interface works as follows:

  1. When you write an Entry to the space using the IJSpace or IMap API, the CacheStore.store() method is called in a class that implements the CacheStore interface.
  2. When you try to read data from the space and the relevant Entry with the specified UID/Primary Key does not exist in the space, the CacheLoader.load() method is called.
  3. The CacheLoader.load() method allows you to retrieve the relevant data, based on a key, from the database back into the space, which returns it to the client.
  4. The generated object is cached in the space, and a subsequent read of this object is served by the space and returned to the caller.
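
The four steps above can be sketched as follows. This is a minimal illustration, not the GigaSpaces implementation: LoaderSketch and StoreSketch are hypothetical, simplified stand-ins for the CacheLoader and CacheStore interfaces (whose real signatures differ), the space is reduced to an in-memory map, and calling the store before caching the value is an assumed ordering.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; NOT the GigaSpaces interface signatures.
interface LoaderSketch { Object load(Object key); }
interface StoreSketch { void store(Object key, Object value); }

class ReadWriteThroughSketch {
    private final Map<Object, Object> memory = new HashMap<>();
    private final LoaderSketch loader;
    private final StoreSketch store;

    ReadWriteThroughSketch(LoaderSketch loader, StoreSketch store) {
        this.loader = loader;
        this.store = store;
    }

    // Step 1: a write is passed through to the data source and kept in memory.
    void write(Object key, Object value) {
        store.store(key, value);
        memory.put(key, value);
    }

    // Steps 2-4: on a miss, load by key, cache the result, and return it.
    Object read(Object key) {
        Object value = memory.get(key);
        if (value == null) {
            value = loader.load(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    // Simulates the entry leaving memory, for example through eviction.
    void evict(Object key) {
        memory.remove(key);
    }
}

class ReadWriteThroughDemo {
    public static void main(String[] args) {
        Map<Object, Object> database = new HashMap<>(); // stands in for the data source
        int[] loads = {0};                              // counts calls to the loader
        ReadWriteThroughSketch space = new ReadWriteThroughSketch(
                key -> { loads[0]++; return database.get(key); },
                database::put);

        space.write("id-1", "john");
        space.read("id-1");  // served from memory, loader not called
        space.evict("id-1");
        space.read("id-1");  // miss: loader is called and the result is cached
        space.read("id-1");  // served from memory again
        System.out.println("loader calls: " + loads[0]);
    }
}
```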

GigaSpaces ORM Mapping Files – gs.xml

Similar to the hbm.xml mapping files described above, the gs.xml ORM mapping file is used to configure the GigaSpaces system.

The gs.xml file provides translation between data formats on several levels:

  • Provides the relation between the client metadata class for POJO and the space class IGSEntry in the write direction.
  • Provides the relation between the client metadata class and the space class IGSEntry in the read direction. This is relevant for read operations by the Hibernate CacheLoader whose output is POJO.

Cluster Topologies Supported

The CacheLoader/Store Interface can be used with both replicated and partitioned space cluster topologies.

Partitioned Clustered Space vs. Replicated Clustered Space

In a partitioned clustered space, the stored data is divided into different physical partitions, each partition normally running in a different VM, where the entry hash code is used to determine the location of each Entry across the different partitions.
In a replicated clustered space, the data is continuously replicated (synchronously or asynchronously) to each cluster node, each in a different VM, so that each node always has a copy of every data Entry.
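
The difference between the two topologies can be illustrated with a small sketch in which each cluster node is reduced to an in-memory map. The routing rule used here (hash code modulo the number of partitions) is an assumption chosen for illustration; the text states only that the entry hash code determines the partition, not the exact formula.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClusterTopologySketch {
    // Assumed routing rule: non-negative remainder of the hash code.
    static int partitionFor(Object routingValue, int partitionCount) {
        return Math.floorMod(routingValue.hashCode(), partitionCount);
    }

    // Creates the given number of empty nodes.
    static List<Map<Object, Object>> newNodes(int count) {
        List<Map<Object, Object>> nodes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            nodes.add(new HashMap<>());
        }
        return nodes;
    }

    // Partitioned: exactly one node holds the entry.
    static void writePartitioned(List<Map<Object, Object>> partitions, Object key, Object value) {
        partitions.get(partitionFor(key, partitions.size())).put(key, value);
    }

    // Replicated: every node holds a copy of the entry.
    static void writeReplicated(List<Map<Object, Object>> replicas, Object key, Object value) {
        for (Map<Object, Object> replica : replicas) {
            replica.put(key, value);
        }
    }

    public static void main(String[] args) {
        List<Map<Object, Object>> partitions = newNodes(3);
        List<Map<Object, Object>> replicas = newNodes(3);
        writePartitioned(partitions, "john", "entry");
        writeReplicated(replicas, "john", "entry");
        System.out.println("partitioned: " + partitions);
        System.out.println("replicated:  " + replicas);
    }
}
```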

Both partitioned and replicated clustered spaces can have the following possible configurations when using the CacheLoader/Store:

  • Central database for all spaces or individual database server for each space.
  • A backup node for hot failover or without backup node.
  • With or without a local cache.

Partitioned spaces have the advantages of scalability and the ability to perform a parallel query that spans all partitions.

Replication Rules

The following replication rules hold in a replicated space:

  • Replication is performed only on "destructive" operations, those that change the data source content, like write and update, but not for read operations. Thus, if an Entry is loaded from database to a space, it will not be replicated to other cluster nodes.
  • If an Entry is removed from a space due to lease expiry, the removal is replicated to the other cluster nodes.
  • If an Entry is removed from a space due to eviction, the removal is not replicated to the other cluster nodes.
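
The rules above can be modeled with a toy sketch. ReplicatedNodeSketch is a hypothetical class written for this illustration only: replication is simulated as a direct, synchronous update of peer maps, and the method names are not GigaSpaces API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one node in a replicated space (hypothetical, for illustration).
class ReplicatedNodeSketch {
    final Map<Object, Object> entries = new HashMap<>();
    private final List<ReplicatedNodeSketch> peers = new ArrayList<>();

    void addPeer(ReplicatedNodeSketch peer) {
        peers.add(peer);
    }

    // Destructive operation (write/update): replicated to the peers.
    void write(Object key, Object value) {
        entries.put(key, value);
        for (ReplicatedNodeSketch peer : peers) {
            peer.entries.put(key, value);
        }
    }

    // Entry loaded from the database on a read: stays local, not replicated.
    void loadedFromDatabase(Object key, Object value) {
        entries.put(key, value);
    }

    // Removal due to lease expiry: replicated to the peers.
    void leaseExpired(Object key) {
        entries.remove(key);
        for (ReplicatedNodeSketch peer : peers) {
            peer.entries.remove(key);
        }
    }

    // Removal due to eviction: local only, not replicated.
    void evict(Object key) {
        entries.remove(key);
    }
}

class ReplicationRulesDemo {
    public static void main(String[] args) {
        ReplicatedNodeSketch a = new ReplicatedNodeSketch();
        ReplicatedNodeSketch b = new ReplicatedNodeSketch();
        a.addPeer(b);

        a.write("k1", "v1");              // b receives k1
        a.loadedFromDatabase("k2", "v2"); // b does not receive k2
        a.evict("k1");                    // b keeps k1
        System.out.println("a=" + a.entries + " b=" + b.entries);
    }
}
```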

The following sections show how the read/write through operations work in various cluster topologies.

Query Delegation to an Underlying Data Source

The CacheLoader/Store Interface is able to delegate SQL Query information from the client level to the underlying data source. The CacheQuery object is used to encapsulate SQL Query information in a standard and unified way. Whether IJSpace.read is called with an object template or with an explicit SQL Query, the information is transformed into the standard CacheQuery object and passed as an argument to the CacheLoader.loadAll or CacheIteratorFactory.iterator methods.

CacheQuery - Encapsulating an SQL Query

The CacheQuery object encapsulates SQL Query information for passing as method arguments in the CacheLoader/Store Interface. The information allows the user to build the relevant query to fetch the required data from the database. The method CacheQuery.getQuery extracts the information.
Depending on the parameter passed to the API read method, CacheQuery can hold an Entry/POJO, an SQLQuery, or an IGSEntry object:

  • CacheQuery.getQuery() returns an IGSEntry when a null-valued template is passed.
  • CacheQuery.getQuery() returns an SQLQuery when a non-null-valued template, or an ExternalEntry template with extended match codes set, is passed.

For more details, see the Javadoc.

For a multiple read in the client API, the CacheLoader.loadAll method is called as follows:
CacheLoader.loadAll(Collection<CacheQuery>).
The same processing occurs with the method CacheIteratorFactory.iterator(CacheQuery).

CacheQuery Example

In the following example, it is required to get a maximum of 1000 Entry objects of the type Person (or its sub-classes), whose firstName attribute value is john and whose lastName can be anything.
Perform the following call to IJSpace.readMultiple:

Entry[] result = space.readMultiple(person_template, null, 1000);

Where the first parameter person_template is a Person class template with the lastName and firstName attributes:

lastName=null
firstName="john"

If the space does not hold 1000 matching results, the above client JavaSpaces API call invokes CacheLoader.loadAll, passing a collection of CacheQuery objects.
For each CacheQuery object, a call to CacheQuery.getQuery() returns an SQLQuery object.
SQLQuery.getQuery() returns a string with the value:

firstName='john'

SQLQuery.getClassName() returns the relevant class name, Person. With this information the user constructs the relevant query to hit the database – for example:

select * from Person where firstName='john'

Based on the result set, the return value of CacheLoader.loadAll is created.
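
The query construction described in this example might look like the sketch below. QueryInfoSketch is a hypothetical stand-in that exposes only the two accessors named in the text (getClassName and getQuery); it is not the GigaSpaces SQLQuery class. The sketch also assumes that the table name equals the simple class name, which a real data model may not guarantee.

```java
// Hypothetical stand-in for the two SQLQuery accessors named in the text.
interface QueryInfoSketch {
    String getClassName();
    String getQuery();
}

// Simple fixed-value implementation used for the demonstration.
class FixedQuerySketch implements QueryInfoSketch {
    private final String className;
    private final String query;

    FixedQuerySketch(String className, String query) {
        this.className = className;
        this.query = query;
    }

    public String getClassName() { return className; }
    public String getQuery() { return query; }
}

class SelectBuilderSketch {
    // Builds a select statement from the class name and the where clause.
    // Assumption: the table is named after the simple class name.
    static String toSelect(QueryInfoSketch q) {
        String className = q.getClassName();
        String table = className.substring(className.lastIndexOf('.') + 1);
        String where = q.getQuery();
        if (where == null || where.trim().isEmpty()) {
            return "select * from " + table; // no condition: fetch every row
        }
        return "select * from " + table + " where " + where;
    }

    public static void main(String[] args) {
        QueryInfoSketch q = new FixedQuerySketch("Person", "firstName='john'");
        System.out.println(toSelect(q));
    }
}
```

Concatenating the where clause keeps the sketch short. In production code, bound parameters are generally preferable to string concatenation when building SQL.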

Disassembling a Query

When using the GigaSpaces JavaSpaces API with the SQLQuery or the GigaSpaces JDBC API, the executed query is disassembled into smaller partial queries when delegated to the CacheLoader. The space assembles the partial results and returns them to the client.

For more details, refer to the Querying the Space section.

Unified Design for Use with JavaSpace API and Map API

The CacheLoader/Store Interface has been designed for consistent and common use when working with JavaSpace API or Map API, making it possible to use the same CacheLoader/Store Interface implementation with either API.

IGSEntry Interface

The generic IGSEntry Interface makes it possible to use the CacheLoader/Store Interface with either API in the same way. An IGSEntry object is passed as a parameter in the CacheLoader/Store Interface methods and provides access to data written into the space by all supported GigaSpaces standard APIs, including GigaSpaces object metadata such as UID and version ID. It has a common structure that is able to carry Entry object data, POJO object data, or Map key-value object data. Helper methods are provided to extract data from the IGSEntry and to convert the IGSEntry to its original format.

In the future, the CacheLoader will have an option to provide the original user object as part of the interface methods arguments without any explicit conversion.

The sections below describe the flow at the CacheLoader using the IGSEntry parameter.

Helper Conversion Methods

The AbstractCacheLoader helper class and its AbstractCacheLoader.getConvertor() method are provided to convert a space entry in IGSEntry format to its original format (Entry/POJO) and vice versa.
AbstractCacheLoader.getConvertor() returns an IConverter object that includes methods you can use to convert the space's internal representation of the entry to its original object format, to be written to the database or to a remote site.

You should extend your CacheLoader/CacheStore/CacheBulk implementation from the AbstractCacheLoader and implement the load and loadAll methods.
The getConverter method returns an IConverter object whose conversion methods include toObject(IGSEntry).

CacheBulk as a Mirror Implementation
When implementing the CacheBulk as a Mirror implementation and extending the AbstractCacheLoader, you should return null from the load method and new HashMap() from the loadAll method.
When using the AbstractCacheLoader.getConverter().toObject(IGSEntry), your Entry must include getter and setter methods.
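
The Mirror guidance above can be sketched as follows. AbstractLoaderSketch is a hypothetical stand-in for AbstractCacheLoader, and the method signatures shown are simplified assumptions rather than the real GigaSpaces signatures. Only the pattern is taken from the text: load returns null, loadAll returns an empty HashMap, and bulk writes are forwarded to the data source.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for AbstractCacheLoader; real signatures may differ.
abstract class AbstractLoaderSketch {
    abstract Object load(Object key);
    abstract Map<Object, Object> loadAll(Collection<?> keys);
}

// A mirror only receives writes, so it never loads anything into the space.
class MirrorBulkSketch extends AbstractLoaderSketch {
    // Stands in for the external data source that the mirror writes to.
    final List<Object> writtenToDataSource = new ArrayList<>();

    @Override
    Object load(Object key) {
        return null; // nothing to load for a mirror
    }

    @Override
    Map<Object, Object> loadAll(Collection<?> keys) {
        return new HashMap<>(); // empty result, as recommended above
    }

    // Simplified stand-in for CacheBulk.store(List<BulkEntry>).
    void store(List<?> bulk) {
        writtenToDataSource.addAll(bulk);
    }
}

class MirrorBulkDemo {
    public static void main(String[] args) {
        MirrorBulkSketch mirror = new MirrorBulkSketch();
        List<String> bulk = new ArrayList<>();
        bulk.add("entry-1");
        bulk.add("entry-2");
        mirror.store(bulk);
        System.out.println("persisted: " + mirror.writtenToDataSource);
    }
}
```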

Mapping of JavaSpace API and Map API to CacheLoader/Store Methods

The following tables summarize the correspondence between the client API read and write calls and the CacheLoader/Store Interface methods that are called.

JavaSpaces API:

Client API | Item | CacheLoader/Store Method
read | Single Entry, null template | CacheLoader.load
take | Single Entry, null template | CacheLoader.load
read | Single Entry, non-null template | CacheLoader.loadAll, CacheIteratorFactory.iterator
take | Single Entry, non-null template | CacheLoader.loadAll, CacheIteratorFactory.iterator, CacheStore.eraseAll
readMultiple | Any template | CacheLoader.loadAll, CacheIteratorFactory.iterator
takeMultiple | Any template | CacheLoader.loadAll, CacheIteratorFactory.iterator, CacheStore.eraseAll
write | Single Entry | CacheStore.store
update | Single Entry | CacheStore.store
writeMultiple | Entry array | CacheStore.storeAll
updateMultiple | Entry array | CacheStore.storeAll
write, take, update, writeMultiple, takeMultiple or updateMultiple (under transaction) | Entry | CacheBulk.store(List<BulkEntry>), CacheStore.store, CacheStore.erase

Map API:

Client API | Item | CacheLoader/Store Method
get | POJO object | CacheLoader.load
remove | POJO object | CacheLoader.load, CacheStore.erase
getAll | Collection of POJO objects | CacheLoader.load
put | POJO object | CacheStore.store
putAll | Collection of POJO objects | CacheStore.storeAll
put, get, remove, putAll or getAll (under transaction) | POJO object | CacheBulk.store(List<BulkEntry>), CacheStore.store, CacheStore.erase
