Fast Recovery for Fast Data: Three Preload Strategies with XAP 12.1


Ali Hodroj April 2, 2017
5 minutes read

As enterprises virtualize more business-critical data for fast data workloads, business continuity and disaster recovery become a growing concern. With fast data processing, real-time streaming analytics, and just-in-time business decisions becoming the key value creators for the insight-driven business, the question of adopting in-memory computing technologies (in-memory data grids, streaming engines) has changed from a matter of “if” to a matter of “when.”

But terabyte-scale fast data workloads raise new considerations for in-memory computing. One critical capability is reliable persistence combined with fast data reload. In disaster recovery terms, this translates into a sub-second recovery point objective (RPO) and an extremely short recovery time objective (RTO).

[Figure: Fast recovery for fast data]

Financial Constraints, Technological Impediments

Traditionally, mission-critical RTO and RPO targets have been met through high levels of redundancy (multiple data centers, database replicas). But the cold, hard reality for most businesses is that the cost of downtime (or of losing hours’ worth of data) is less than the cost of maintaining another data center. Ultimately, this turns disaster recovery into a financial calculation: at what point does the cost of data loss and downtime exceed the cost of a fast data restore strategy that prevents disruption to business continuity?

In addition, there’s another looming technical issue specific to real-time, in-memory workloads: in-memory data stores require the data set to be reloaded from disk after a restart to ensure both data consistency and predictable performance for front-end applications. If your data set is a few gigabytes, reloading it from a database or NoSQL data store might provide a reasonable RTO. In reality, though, fast data workloads require both velocity (real-time analytics) and volume (terabytes of real-time and historical reference data). Rebuilding such a volume from disk could take hours.
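
To make the scaling concrete, here is a rough back-of-the-envelope sketch in Python. The 200 MB/s sustained reload throughput is a hypothetical figure chosen only for illustration, not a measurement of any particular database or of XAP; the point is that reload time grows linearly with data volume.

```python
def reload_seconds(dataset_gb: float, throughput_mb_per_s: float) -> float:
    """Estimated time to stream a data set at a sustained throughput."""
    return (dataset_gb * 1024) / throughput_mb_per_s


# Hypothetical throughput, used only to illustrate the scaling.
THROUGHPUT_MB_PER_S = 200

small = reload_seconds(dataset_gb=5, throughput_mb_per_s=THROUGHPUT_MB_PER_S)
large = reload_seconds(dataset_gb=5 * 1024, throughput_mb_per_s=THROUGHPUT_MB_PER_S)

print(f"5 GB: {small:.1f} seconds")        # 5 GB: 25.6 seconds
print(f"5 TB: {large / 3600:.1f} hours")   # 5 TB: 7.3 hours
```

Under these assumed numbers, a data set that reloads in under half a minute at gigabyte scale takes most of a working day at terabyte scale, which is the RTO problem described above.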

The Blurring Line Between Memory and Storage

Since its appearance in the 1980s, flash memory has become commercially successful for small devices requiring removable storage. Over the last decade, flash technologies have seen significant performance advances while prices have dropped. As price-performance ratios become increasingly disruptive, flash storage and non-volatile memory are creating new opportunities for hybrid fast data stores. Such stores can converge access to data-in-motion and data-at-rest at low latencies.
[Figure: Blurring line between memory and storage]
One way to exploit this disruption for fast data recovery is to tier memory and flash storage together in a hybrid-storage in-memory data grid for mixed transactional and analytical workloads. The GigaSpaces In-Memory Computing platform with XAP MemoryXtend is an implementation of such an architecture:
[Figure: Hybrid storage in-memory data grid]
XAP MemoryXtend leverages XAP’s in-memory computing capabilities alongside an embedded persistent store optimized for fast storage and fast restart. In this architecture, writes are synchronous: a write completes only once it has been persisted to a flash-optimized key/value store. Reads are served from RAM if the data was recently read or written, and are otherwise fetched from the underlying flash storage. In other words, it’s a classic LRU cache architecture that ties RAM and flash together.
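
The read and write paths described above can be sketched as a toy two-tier store. This is a minimal illustration of a write-through LRU cache, not MemoryXtend’s implementation: the `TieredStore` class, its method names, and the plain dictionary standing in for the flash key/value store are all assumptions made for the example.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a bounded RAM cache in front of a persistent store."""

    def __init__(self, ram_capacity: int):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # least recently used entry comes first
        self.flash = {}           # stand-in for the flash key/value store

    def _cache(self, key, value):
        # Mark the entry as most recently used, then evict if over capacity.
        self.ram[key] = value
        self.ram.move_to_end(key)
        while len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)  # evicted from RAM only; flash keeps it

    def write(self, key, value):
        # Synchronous write-through: persist first, then cache in RAM.
        self.flash[key] = value
        self._cache(key, value)

    def read(self, key):
        if key in self.ram:  # RAM hit
            self.ram.move_to_end(key)
            return self.ram[key]
        value = self.flash[key]  # RAM miss: fetch from flash (KeyError if absent)
        self._cache(key, value)
        return value


store = TieredStore(ram_capacity=2)
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    store.write(key, value)

print("a" in store.ram)   # False: "a" was evicted from RAM, but is still on flash
print(store.read("a"))    # 1: fetched from flash and cached again
print(list(store.ram))    # ['c', 'a']: "b" was evicted to make room
```

Because every write reaches the persistent tier before it is acknowledged, evicting an entry from RAM never loses data; eviction only affects how fast the next read of that entry will be.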

XAP 12.1: Fast and Flexible Flash-to-RAM Data Reload

Prior to XAP 12.1, preloading data into the JVM heap was an “all or nothing” decision between two strategies:

Strategy #1: Lazy Load
With this approach, no data is loaded into the JVM heap upon a restart. As read throughput from clients increases, most of the data will eventually load into the data grid’s RAM tier. This is the preferred approach when the volume of data persisted on flash far exceeds what can fit into memory. The JVM heap portions of the data grid partitions act as an LRU cache between RAM and SSD.

Strategy #2: Eager Preload 
With this approach, everything is preloaded from SSD/flash into the IMDG memory partitions upon restart. This guarantees that any subsequent read request will hit RAM rather than go to disk. The eager preload pattern is often used when the persisted data is no larger than what can fit into RAM. It also suits cases where read latencies need to be predictable upfront. JVM heap warm-up, however, might be slow when a large data set is being preloaded.
[Figure: Fast and flexible flash-to-RAM data reload]
Strategy #3: Hybrid Lazy/Eager Preload, a XAP 12.1 innovation
Clearly, the choices above involve a trade-off between how much of the JVM heap to warm up and how long the data grid takes to start accepting read/write operations. XAP 12.1 breaks this dichotomy by allowing a data preload strategy that mixes both approaches. We’ve introduced the ability to select, through SQL queries, the subset of data that should be preloaded into RAM. For instance, an e-commerce application may choose to preload all shopping carts into RAM while keeping customer profiles and product catalog data on flash. The decision might also be time-based: preload all customer data generated within the last 24 hours, while keeping the remaining historical data on flash until it’s needed. Essentially, this approach follows the guiding principle of multi-tiered data storage architectures: the time value of data should match the cost of the storage or memory tier it lives on, so hot, recent data earns a place in RAM while colder historical data stays on cheaper flash.
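
The idea of query-driven preloading can be illustrated with a small, self-contained Python sketch. It does not use XAP’s configuration or API: an in-memory SQLite table stands in for the flash tier, and the table layout, the `PRELOAD_PREDICATES` list, and the `preload` and `read` helpers are hypothetical names invented for this example.

```python
import sqlite3

# An in-memory SQLite table stands in for the flash tier (illustration only).
flash = sqlite3.connect(":memory:")
flash.execute(
    "CREATE TABLE entries (id TEXT PRIMARY KEY, type TEXT, age_hours REAL, payload TEXT)"
)
flash.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [
        ("cart-1", "ShoppingCart", 2.0, "cart one"),
        ("cart-2", "ShoppingCart", 30.0, "cart two"),
        ("cust-1", "CustomerProfile", 5.0, "recent customer"),
        ("cust-2", "CustomerProfile", 400.0, "older customer"),
        ("prod-1", "Product", 1000.0, "catalog item"),
    ],
)

# Hypothetical preload rules, one SQL predicate each. They are trusted
# configuration values, which is why they are interpolated into the query.
PRELOAD_PREDICATES = [
    "type = 'ShoppingCart'",                         # all shopping carts
    "type = 'CustomerProfile' AND age_hours <= 24",  # customers from the last 24 hours
]


def preload(conn, predicates):
    """Eagerly load only the rows matching the preload predicates into RAM."""
    ram = {}
    for predicate in predicates:
        query = f"SELECT id, payload FROM entries WHERE {predicate}"
        for entry_id, payload in conn.execute(query):
            ram[entry_id] = payload
    return ram


def read(conn, ram, entry_id):
    """Serve from RAM when possible; otherwise lazily load from the flash tier."""
    if entry_id not in ram:
        row = conn.execute(
            "SELECT payload FROM entries WHERE id = ?", (entry_id,)
        ).fetchone()
        if row is None:
            raise KeyError(entry_id)
        ram[entry_id] = row[0]
    return ram[entry_id]


ram = preload(flash, PRELOAD_PREDICATES)
print(sorted(ram))                  # ['cart-1', 'cart-2', 'cust-1']
print(read(flash, ram, "prod-1"))   # catalog item (lazily loaded on first read)
```

In this sketch the grid would be ready to serve requests as soon as the three selected entries are in RAM, while the older customer profile and the catalog item stay on the slower tier until a read asks for them.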

[Figure: Hybrid lazy/eager preload, a XAP 12.1 innovation]

To sum it all up, the diagram below shows the suggested approaches to the hybrid preload strategy for different use cases:

By extending XAP with a data store built for fast recovery, tens and even hundreds of terabytes can be made immediately available for consumption by business applications at predictable latencies.
