Summary: Explains the concepts of the GigaSpaces in-memory data grid (the Space), how to access it and how to configure advanced capabilities such as persistency, eviction, etc.
This section describes the Space, the GigaSpaces in-memory data grid implementation. For a business-oriented description of this product offering, see the In-Memory Data Grid page on our corporate website.
The Space enables your application to read data from it and write data to it in various ways.
This section also covers various configuration aspects, such as space topologies, persistency to an external data source and memory management facilities.
The Space as the System of Record
One of the unique concepts of GigaSpaces is that its in-memory data grid (the Space) serves as the system of record for your application.
This means that all or major parts of your application's data are stored in the space and your data access layer interacts with it via the various space APIs. This allows for ultra-fast read and write performance, while still maintaining a high level of reliability and fault tolerance via data replication to peer space instances in the cluster, and eventual persistency to a relational database if needed.
Characteristics of a Space
The space has a number of determining characteristics that should be configured when it's created, as described below.
The Space Clustering Topology
The space can have a single instance (in which case it runs on a single JVM) or multiple instances, in which case it can run on multiple JVMs.
When having multiple instances, the space can run in a number of topologies which determine how the data is distributed across those JVMs. In general, the data can be either replicated, which means it resides on all of the JVMs in the cluster, or partitioned, which means that the data is distributed across all of the JVMs, each containing a different subset of it. With a partitioned topology you can also assign one or more backup space instances for each partition.
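The partitioned topology described above can be illustrated with a minimal sketch. It is a toy model and not the GigaSpaces API: the class `PartitionedGridSketch`, its methods, and the routing scheme (hash of the key modulo the number of partitions) are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a partitioned topology (hypothetical, not the GigaSpaces API). */
class PartitionedGridSketch {
    // Each map stands in for one space instance running in its own JVM.
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedGridSketch(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Assumed routing scheme: hash of the key modulo the number of partitions. */
    int partitionFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), partitions.size());
    }

    /** Stores the entry in exactly one partition. */
    void write(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    /** Looks the entry up in the partition that owns the key. */
    String read(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    int sizeOf(int partition) {
        return partitions.get(partition).size();
    }

    public static void main(String[] args) {
        PartitionedGridSketch grid = new PartitionedGridSketch(3);
        for (String key : new String[] {"order-1", "order-2", "order-3", "order-4"}) {
            grid.write(key, "payload of " + key);
            System.out.println(key + " -> partition " + grid.partitionFor(key));
        }
    }
}
```

Because each key always maps to the same partition, every instance holds a different subset of the data. A replicated topology would instead write each entry to all of the maps.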
Master-Local Space
Regardless of the space's topology, you can also define a "local cache" for space clients, which caches space entries recently used by the client or a predefined subset of the central space's data (this is often referred to as "Continuous Query").
The data cached on the client side is kept up to date by the server, so whenever another space client changes a space entry that resides in a certain client's local cache, the space will make sure to update that client.
The Replication Mode
When running multiple space instances, in many cases the data should be replicated from one space instance to another. This can happen in a replicated topology (in which case every change to the data is replicated to all of the space instances that belong to the space) or in a partitioned topology (if you chose to have backups for each partition).
There are two replication modes - synchronous and asynchronous. With synchronous replication, data is replicated to the target instance as it is written. So the client code which wrote, updated or deleted data will wait until replication to the target is completed.
With asynchronous replication, this replication is done in a separate thread, and the calling client does not wait for the replication to complete.
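The difference between the two replication modes can be sketched as follows. This is a toy model under stated assumptions, not the GigaSpaces replication mechanism: `ReplicationSketch` and its methods are hypothetical names, and the "backup instance" is simply a second in-process map.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of synchronous vs. asynchronous replication (hypothetical). */
class ReplicationSketch implements AutoCloseable {
    private final Map<String, String> primary = new ConcurrentHashMap<>();
    private final Map<String, String> backup = new ConcurrentHashMap<>();

    // A single daemon thread plays the role of the separate replication thread.
    private final ExecutorService replicator = Executors.newSingleThreadExecutor(task -> {
        Thread thread = new Thread(task, "replicator");
        thread.setDaemon(true);
        return thread;
    });

    /** Synchronous mode: the caller returns only after the backup was updated. */
    void writeSync(String key, String value) {
        primary.put(key, value);
        backup.put(key, value);
    }

    /** Asynchronous mode: the caller returns right away; replication happens later. */
    CompletableFuture<Void> writeAsync(String key, String value) {
        primary.put(key, value);
        return CompletableFuture.runAsync(() -> backup.put(key, value), replicator);
    }

    String readPrimary(String key) {
        return primary.get(key);
    }

    String readBackup(String key) {
        return backup.get(key);
    }

    @Override
    public void close() {
        replicator.shutdown();
    }

    public static void main(String[] args) {
        try (ReplicationSketch grid = new ReplicationSketch()) {
            grid.writeSync("a", "1");
            System.out.println("backup after sync write: " + grid.readBackup("a"));

            CompletableFuture<Void> pending = grid.writeAsync("b", "2");
            pending.join(); // wait here only to make the demo deterministic
            System.out.println("backup after async write completed: " + grid.readBackup("b"));
        }
    }
}
```

In the asynchronous case the returned future is only a convenience of this sketch. The point is that the write call itself does not wait for the backup to be updated.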
Persistency Configuration
The space is an in-memory data grid. As such, its capacity is limited to the sum of the memory capacity of all the JVMs on which the space instances run.
In many cases you have to deal with larger amounts of data, or load a subset of a larger data set, which resides in an external data source such as a relational database, into the space.
The space supports many persistency options, allowing you to easily configure how it will interact with an external relational database or a more exotic source of data.
It supports the following options, from which you can choose:
Cache warm-up: load data from an external data source on startup
Cache read through: read data from the external data source when it's not found in the space
Cache write through: write data to the external data source when it's written to the space
Cache write behind (also known as asynchronous persistency): write data to the external data source asynchronously (yet reliably) to avoid the performance penalty
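The four options listed above can be sketched with a toy model. This is not the GigaSpaces External Data Source API: `PersistencySketch` and its methods are hypothetical, the external data source is represented by a plain `Map`, and the write-behind queue is drained by an explicit `flush()` call where a real implementation would use a background process.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of the persistency options (hypothetical, not the GigaSpaces API). */
class PersistencySketch {
    private final Map<String, String> space = new HashMap<>();
    private final Map<String, String> dataSource; // stands in for the external database
    private final Queue<Map.Entry<String, String>> pendingWrites = new ArrayDeque<>();

    PersistencySketch(Map<String, String> dataSource) {
        this.dataSource = dataSource;
    }

    /** Cache warm-up: load the data from the external data source on startup. */
    void warmUp() {
        space.putAll(dataSource);
    }

    /** Cache read through: fall back to the data source when the space misses. */
    String read(String key) {
        String value = space.get(key);
        if (value == null) {
            value = dataSource.get(key);
            if (value != null) {
                space.put(key, value);
            }
        }
        return value;
    }

    /** Cache write through: update the data source as part of the write. */
    void writeThrough(String key, String value) {
        space.put(key, value);
        dataSource.put(key, value);
    }

    /** Cache write behind: update the space now and queue the data source update. */
    void writeBehind(String key, String value) {
        space.put(key, value);
        pendingWrites.add(new SimpleImmutableEntry<>(key, value));
    }

    /** Applies the queued writes in order and returns how many were applied. */
    int flush() {
        int flushed = 0;
        Map.Entry<String, String> next;
        while ((next = pendingWrites.poll()) != null) {
            dataSource.put(next.getKey(), next.getValue());
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("k1", "v1");
        PersistencySketch grid = new PersistencySketch(database);
        System.out.println("read through: " + grid.read("k1"));
        grid.writeBehind("k2", "v2");
        System.out.println("in database before flush: " + database.containsKey("k2"));
        grid.flush();
        System.out.println("in database after flush: " + database.containsKey("k2"));
    }
}
```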
Eviction Policy and Memory Management
Since the space is memory-based it is essential to verify that it doesn't overflow and crash. The space has a number of facilities to manage its memory and make sure it doesn't overflow.
The first one is the eviction policy. The space supports two eviction policies: ALL_IN_CACHE and LRU (Least Recently Used). With the LRU policy, the space starts to evict the least recently used entries when it becomes full. The ALL_IN_CACHE policy never evicts anything from the space.
The memory manager allows you to define numerous thresholds that control when entries will be evicted (in case you use LRU) or when the space will simply block clients from adding data to it.
Combined, these two facilities enable you to better control your environment and make sure that the memory of the space instances in your cluster does not overflow.
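The interplay between the eviction policy and a capacity limit can be sketched as follows. This is a toy model built on `java.util.LinkedHashMap`, not the space's memory manager: `EvictionSketch` is a hypothetical class, the limit is counted in entries instead of memory usage thresholds, and "blocking clients from adding data" is modeled by throwing an exception.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of ALL_IN_CACHE vs. LRU behavior (hypothetical, entry-count based). */
class EvictionSketch {
    enum Policy { ALL_IN_CACHE, LRU }

    private final Policy policy;
    private final int maxEntries;
    private final LinkedHashMap<String, String> entries;

    EvictionSketch(Policy policy, int maxEntries) {
        this.policy = policy;
        this.maxEntries = maxEntries;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // Only the LRU policy evicts; ALL_IN_CACHE never does.
                return policy == Policy.LRU && size() > maxEntries;
            }
        };
    }

    /** Writes an entry; with ALL_IN_CACHE a full space rejects new entries. */
    void write(String key, String value) {
        boolean isNewEntry = !entries.containsKey(key);
        if (policy == Policy.ALL_IN_CACHE && isNewEntry && entries.size() >= maxEntries) {
            throw new IllegalStateException("space is full, write rejected");
        }
        entries.put(key, value);
    }

    /** Reading an entry marks it as recently used. */
    String read(String key) {
        return entries.get(key);
    }

    boolean contains(String key) {
        return entries.containsKey(key);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        EvictionSketch lru = new EvictionSketch(Policy.LRU, 2);
        lru.write("a", "1");
        lru.write("b", "2");
        lru.read("a");       // "a" is now more recently used than "b"
        lru.write("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println("a kept: " + lru.contains("a") + ", b kept: " + lru.contains("b"));
    }
}
```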
APIs to Access the Space
The space supports a number of APIs to allow for maximum flexibility to space clients when accessing the space:
The core Space API, which is the recommended option, allows you to read objects from the space based on various criteria, write objects to it, remove objects from it and get notified about changes made to objects. It is inspired by the JavaSpaces specification and the tuple space model, although the basic data unit is a POJO, which means the space entries are simply Java objects. This API also supports transactions.
Accessing the space from other languages: the core Space API is also supported in .NET and C++, which allows clients to access the space from these languages. It also supports interoperability between languages, so you can write an object to the space using one language, say C++, and read it with another, say Java.
The Map API, which allows you to access entries using a key/value approach. This is only recommended for specific scenarios where you only retrieve objects based on their IDs and would settle for the Map interface, which is very limited in functionality compared to the core API.
The JDBC API, which allows you to access the space similar to how you would access a relational database (note that it has a number of limitations).
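The tuple space model behind the core Space API can be illustrated with a minimal sketch. It is not the GigaSpaces implementation: `TupleSpaceSketch` and its `Order` entry class are hypothetical, and the matching rule shown (a null field in the template matches any value) follows the JavaSpaces template convention and is assumed here for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy tuple space with POJO entries and template matching (hypothetical). */
class TupleSpaceSketch {
    /** Hypothetical entry type; in a template, a null field acts as a wildcard. */
    static final class Order {
        final String id;
        final String status;

        Order(String id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    private final List<Order> entries = new ArrayList<>();

    /** Adds an entry to the space. */
    void write(Order order) {
        entries.add(order);
    }

    /** Returns the first entry matching the template without removing it, or null. */
    Order read(Order template) {
        for (Order candidate : entries) {
            if (matches(template, candidate)) {
                return candidate;
            }
        }
        return null;
    }

    /** Removes and returns the first entry matching the template, or null. */
    Order take(Order template) {
        Iterator<Order> iterator = entries.iterator();
        while (iterator.hasNext()) {
            Order candidate = iterator.next();
            if (matches(template, candidate)) {
                iterator.remove();
                return candidate;
            }
        }
        return null;
    }

    int size() {
        return entries.size();
    }

    private static boolean matches(Order template, Order candidate) {
        boolean idMatches = template.id == null || template.id.equals(candidate.id);
        boolean statusMatches = template.status == null || template.status.equals(candidate.status);
        return idMatches && statusMatches;
    }

    public static void main(String[] args) {
        TupleSpaceSketch space = new TupleSpaceSketch();
        space.write(new Order("1", "NEW"));
        space.write(new Order("2", "SHIPPED"));
        Order shipped = space.read(new Order(null, "SHIPPED"));
        System.out.println("first shipped order id: " + shipped.id);
    }
}
```

Notifications and transactions, which the core API also supports, are left out of this sketch.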
Services on top of the Space
Building on the core API, the Space also provides higher-level services to the application. These services, along with the space's basic capabilities, provide the full stack of middleware features you can build your application with:
The Task Execution API allows you to send your code to the space and execute it on one or more nodes in parallel, accessing the space data on each node locally.
Event containers use the core API's operations and shield your code from the low-level details involved in handling an event, such as event registration with the space and transaction initiation. This lets your code focus on business logic and application behavior.
Space Based Remoting allows you to use the space's messaging and code execution capabilities to enable application clients to invoke space-side services transparently using an application-specific interface. Using the space as the transport mechanism for the remote calls allows for location transparency, high availability and parallel execution of the calls without changing the client code.
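The idea behind task execution (run the same code next to each node's data in parallel, then combine the partial results) can be sketched as follows. This is a toy model and not the Task Execution API: `TaskExecutionSketch` and `executeOnAll` are hypothetical names, each partition is a plain list of integers, and a local thread pool stands in for the nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Toy model of running a task on every partition in parallel (hypothetical). */
class TaskExecutionSketch {
    /**
     * Runs the task once per partition, each run seeing only that partition's
     * data, and returns the partial results in partition order.
     */
    static <R> List<R> executeOnAll(List<List<Integer>> partitions,
                                    Function<List<Integer>, R> task) {
        if (partitions.isEmpty()) {
            return new ArrayList<>();
        }
        ExecutorService nodes = Executors.newFixedThreadPool(partitions.size());
        try {
            List<CompletableFuture<R>> pending = new ArrayList<>();
            for (List<Integer> partition : partitions) {
                pending.add(CompletableFuture.supplyAsync(() -> task.apply(partition), nodes));
            }
            List<R> results = new ArrayList<>();
            for (CompletableFuture<R> future : pending) {
                results.add(future.join());
            }
            return results;
        } finally {
            nodes.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5, 6));

        // "Map" step: sum the values held by each partition locally.
        List<Integer> partialSums = executeOnAll(partitions,
                data -> data.stream().mapToInt(Integer::intValue).sum());

        // "Reduce" step: combine the partial results on the client side.
        int total = partialSums.stream().mapToInt(Integer::intValue).sum();
        System.out.println("partial sums: " + partialSums + ", total: " + total);
    }
}
```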
Spring Integration
The space APIs are integrated tightly with the Spring framework.
This gives you the ability to use all of the benefits that Spring brings to the table, such as dependency injection, declarative transaction management, and a well-defined application life cycle model.
In addition, the higher-level services (remoting and event processing) are also tightly integrated with Spring and follow the Spring framework's proven design patterns. GigaSpaces XAP provides a set of well-defined Spring bindings utilizing Spring's support for custom namespaces, which allows you to easily create and wire GigaSpaces components within Spring.
The Space as the Foundation for SBA
Besides its ability to function as an in-memory data grid, the Space's core features and the services on top of it form the foundation for Space Based Architecture (SBA). By using SBA you can gain performance and scalability benefits not available with traditional tier-based architectures, even when these include an in-memory data grid such as the space.
The basic unit of scalability in SBA is the Processing Unit. The Space can be embedded into the processing unit or accessed remotely from it. When embedded into the processing unit, local services such as event handlers and service beans exposed remotely over the space can interact with the local space instance to achieve high performance and scalability. The Space's built-in support for data partitioning is used to distribute the data and processing across the nodes and scale the application.
What's Next
It is recommended that you read the following sections next:
Cluster-Aware Operations — Supported and non-supported operations, limitations, and considerations when working with a clustered space.
The Space Component — A Space component allows you to create an IJSpace (or JavaSpace) based on a space URL.
Space Filters — Space Filters are interceptors inside the GigaSpaces space engine.
Cluster Replication Filters — How to call custom business logic when data is replicated in a replicated cluster topology.
Space Mode Context Loader — Allows you to load a Spring application context only when the Processing Unit or space is in primary mode, and unload it when the Processing Unit or space is in backup mode.
Space URL — An address, passed to GigaSpace, used to connect to a space and remotely create new spaces as well as enable various characteristics.
The GigaSpace Interface — The JavaSpaces API is abstracted in OpenSpaces by a simple wrapper: the GigaSpace interface.
POJO Support — GigaSpaces JavaSpaces API Plain Old Java Object support - the POJO.
POJO Support - Advanced — GigaSpaces JavaSpaces API Plain Old Java Object support - the POJO. This advanced section deals with the annotations and gs.xml mapping file, ways for troubleshooting, considerations, UID generation and usage as well as frequently used code snippets.
SQLQuery — The SQLQuery class is used to query the space using SQL-like syntax.
Local Cache and Local View — OpenSpaces allows you to easily configure and use the space local view feature using the LocalViewSpaceFactoryBean component and local cache using LocalCacheSpaceFactoryBean.
Persistency — GigaSpaces's persistency approach consists of several paradigms for data persistency, according to the application needs. This section gives a basic overview of each paradigm.
External Data Source — External Data Source (EDS) is a space component that provides advanced persistency capabilities for the space architecture.
Transaction Management — OpenSpaces provides several implementations of Spring's PlatformTransactionManager allowing you to use the GigaSpaces and Jini Transaction Manager.
Space Locking and Blocking — Using optimistic and pessimistic locking to preserve the integrity of changes in multi-user scenarios.
Memory Management Facilities — Setting Space cache policy, memory usage and rules for exceeding physical memory capacity.
Programmatic API (Configurers) — This section describes how you can use OpenSpaces components in a non-Spring environment. The constructs which are used to create and configure GigaSpaces components are called Configurers.
FIFO Support — How to get entries in the same order in which they were written to the space.
JDBC Driver — GigaSpaces allows applications to connect to the IMDG using a JDBC driver. A GigaSpaces JDBC driver accepts SQL statements, translates them to space operations, and returns standard result sets.