Overview
Distributed data caching lets system designers optimize data access by placing data close to the business logic, wherever that logic runs. GigaSpaces provides rich caching functionality using distributed memory resources – a GigaSpaces space functions as a cache instance, and clusters of spaces allow deployment of different cache topologies.
The Need for Distributed Caching
Today's business systems must accommodate growing numbers of end-users accessing an ever-expanding set of business applications. Many IT organizations face the challenge of delivering the scalability and responsiveness needed for near-real-time applications.
Trying to solve the problem simply by adding more hardware is an expensive and often ineffective response. Hardware is the brute-force tool; a calibrated software solution is often the better alternative, and distributed caching is one such solution. But while caching solutions provide scalability and data accessibility, they often struggle to meet the requirements of highly transactional applications, resulting in limited performance. In addition, many applications need ways to use available memory resources to boost performance.
GigaSpaces Cache Usage Scenarios
Different applications may have different caching requirements. Some applications require on-demand loading from a remote cache, due to limited memory; others might use the cache for read-mostly purposes; transactional applications need a cache that handles both write and read operations and maintains consistency.
In order to address these different requirements, GigaSpaces Cache is policy-driven. Most of the policies do not affect the actual application code, but rather affect the way each space/cache instance interacts with other instances. GigaSpaces cache policies support the following usage scenarios, among others.
- Reducing I/O overhead when accessing distributed information – accessing a remote process or a machine's hard drive is an expensive operation. The GigaSpaces Data Grid improves access to remote data sources by moving data closer to the application. The cache provides a mechanism for storing, synchronizing, and managing the data from the remote data source in local memory.
- Reliable memory storage – caching adds reliability and sharing capabilities for data stored in memory. An application can maintain critical information in memory, and other applications can access it directly. This can occur without compromising either performance or reliability.
- Distributed session sharing – session information (such as counters, user billing information, user profiles, and in many cases, HTTP sessions) is transient in nature. It is critical to an application during a session, or even during a specific operation; it must be accessed at high speed while the session is active, but is useless once the session ends. The GigaSpaces Data Grid stores session information in a distributed memory unit, allowing high-speed access and sharing by multiple applications.
- Caching of personalization information in a portal application – personalization usually requires dynamic compilation of pages dependent on a user's profile. Since HTML pages in a portal can be built from multiple portlets and pages, this process may require access to multiple remote processes, which can be very expensive. In a distributed portal environment, the user profile must be maintained consistently across all portal instances. In such an environment, the GigaSpaces Data Grid enables access to profile information at in-memory speeds, while maintaining consistency across multiple portal instances.
- Grid – a Grid environment allows utilization of distributed CPU resources, in order to perform complex computational tasks. One of the bottlenecks in performing such tasks is access to the information associated with these tasks (e.g. task information, previous results, conversion tables, processing rules). In many cases, this information resides either in a centralized database or in a file system. At some level, the storage bottleneck can become a real obstacle to parallelization. The GigaSpaces Data Grid relieves the bottleneck by allowing each processing unit to load the information into its own memory address space, on demand.
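The on-demand loading pattern that recurs in the scenarios above (data is fetched from a remote source on first access and then served from local memory) can be sketched in a few lines of Java. This is a minimal illustration only: the class `OnDemandCache`, its methods, and the `remoteLoader` function are hypothetical names invented for this sketch and are not part of the GigaSpaces API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical read-through cache; illustrates the pattern, not a GigaSpaces class. */
final class OnDemandCache<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    // Stands in for the expensive remote data source (database, file system, remote process).
    private final Function<K, V> remoteLoader;
    private final AtomicInteger remoteLoads = new AtomicInteger();

    OnDemandCache(Function<K, V> remoteLoader) {
        this.remoteLoader = remoteLoader;
    }

    /** Returns the local copy, loading it from the remote source on first access only. */
    V get(K key) {
        return local.computeIfAbsent(key, k -> {
            remoteLoads.incrementAndGet();
            return remoteLoader.apply(k);
        });
    }

    /** Drops the local copy so that the next read goes back to the remote source. */
    void invalidate(K key) {
        local.remove(key);
    }

    /** Number of times the remote source was actually consulted. */
    int remoteLoadCount() {
        return remoteLoads.get();
    }
}
```

With a loader such as `k -> "rule-for-" + k`, repeated calls to `get("usd-eur")` reach the loader once; after `invalidate("usd-eur")` the next read loads the value again. A real deployment would also need the synchronization and consistency policies described above, which this sketch leaves out.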
Benefits of a JavaSpaces Cache
The JavaSpaces model was designed to address the broader issues in the distributed information domain, including caching, messaging, data sharing, data processing and distribution. This is all done using a single technology and a simple set of APIs. The main highlights of a JavaSpaces-based cache architecture follow.
- Based on proven theoretical models and standards, developed at Yale University in the 1980s and adopted by Sun Microsystems in the late 1990s as the data-sharing model in the Jini environment.
- Reduced complexity – the JavaSpaces model is a high-level middleware that utilizes messaging, RPC and DBMS technologies implicitly. Therefore, the complexity of maintaining a consistent, shared state is hidden from the application programmer.
- Designed for SOA and Grid – the JavaSpaces specification is part of the Jini framework that provides an SOA (Service Oriented Architecture) platform for Java applications. The JavaSpaces model was designed as a service; it therefore fits natively with other SOA implementations, such as Grid, J2EE, web services and .Net.
- Combination of messaging and caching – simplifies parallel processing tasks through messaging and synchronization capacities, while bringing information closer to the processing unit, via the caching infrastructure. The end result is a shared address space that enables data collaboration among different distributed processing entities.
- Rich functionality – the JavaSpaces API offers Two-Phase Commit transaction support, a leasing model, blocking operation semantics, event notification models, and advanced matching capabilities (based on template matching via a simple API).
- Interoperability among Java, J2EE, .Net, C++ – the GigaSpaces master-local cache is optimized for read-mostly scenarios. Portability is one of its advantages; the cache can be accessed from any of its supported interfaces, such as Java, J2EE, .Net, and C++.
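Three of the capabilities listed above (template matching, leasing, and blocking operations) can be illustrated with a small in-process sketch. It is a teaching aid under stated assumptions: `MiniSpace`, its `write` and `take` methods, and the use of maps as entries are hypothetical and do not reproduce the JavaSpaces or GigaSpaces API; transactions and event notification are omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical in-process space; shows the concepts only, not the JavaSpaces API. */
final class MiniSpace {
    private static final class Stored {
        final Map<String, Object> fields;
        final long expiresAtMillis;
        Stored(Map<String, Object> fields, long expiresAtMillis) {
            this.fields = fields;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final List<Stored> entries = new ArrayList<>();

    /** Stores an entry that stays visible for leaseMillis (its lease). */
    synchronized void write(Map<String, Object> fields, long leaseMillis) {
        entries.add(new Stored(Map.copyOf(fields), System.currentTimeMillis() + leaseMillis));
        notifyAll(); // wake up any blocked take() calls
    }

    /** Removes and returns the first matching entry, blocking up to timeoutMillis; null on timeout. */
    synchronized Map<String, Object> take(Map<String, Object> template, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            long now = System.currentTimeMillis();
            for (Iterator<Stored> it = entries.iterator(); it.hasNext();) {
                Stored s = it.next();
                if (s.expiresAtMillis <= now) {
                    it.remove();              // lease expired: entry silently disappears
                } else if (matches(template, s.fields)) {
                    it.remove();
                    return s.fields;
                }
            }
            long remaining = deadline - now;
            if (remaining <= 0) return null;
            try {
                wait(remaining);              // block until a write() or the timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }

    /** A template matches when every field it names has an equal value in the entry. */
    private static boolean matches(Map<String, Object> template, Map<String, Object> fields) {
        for (Map.Entry<String, Object> e : template.entrySet()) {
            if (!Objects.equals(e.getValue(), fields.get(e.getKey()))) return false;
        }
        return true;
    }
}
```

In this sketch a template names only the fields that must match, and fields it omits act as wildcards. For example, `take(Map.of("type", "result"), 100)` removes the first unexpired entry whose `type` is `"result"`, whatever its other fields are, and returns `null` if none arrives within 100 ms.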