Latest release significantly reduces memory footprint and infrastructure costs, and boosts digital application performance.
2020 has been a tough year all around. With tight budgets and limited resources, enterprises are looking to optimize infrastructure TCO. At the same time, they are looking to accelerate digital transformation. According to the 2021 Gartner Board of Directors Survey, 69% of boards of directors accelerated their digital business initiatives following COVID-19 disruption. Such initiatives require the speed, scale and agility to handle the ever-growing amount of business data.
With this in mind, I am excited to announce today the release of GigaSpaces v15.8, which offers advanced functionality to significantly reduce memory footprint, reduce infrastructure costs, and boost digital application performance. The three main pillars of InsightEdge v15.8 are:
- Reducing RAM footprint: optimizing in-memory data store RAM to reduce hardware costs by up to 70% while retaining blazing performance
- Boosting SQL query performance with smart data locality: 10x faster response time on reporting and BI compared to previous releases
- Cloud native lifecycle management: enable agile deployments of new versions of data services with no system downtime
Smart RAM Footprint Reduction to Save up to 70% on Infrastructure Costs
With v15.8, you can now optimize any object by simply marking it as “Storage Optimized”. This will automatically reduce the RAM storage footprint the object requires. The degree of optimization depends on the ratio of indexed properties to unindexed properties (fields). The more unindexed properties you have, the bigger the reduction will be.
Consider a BI dashboard that displays data records with 100 properties, of which only 20 are indexed. Selecting “Storage Optimized” for all unindexed properties will reduce RAM utilization by up to 70%.
Why is this important? It means significant cost savings. Here’s an example. Let’s assume 1TB of RAM costs $10 an hour (for real-life pricing options, see for example AWS EC2 on-demand pricing). This amounts to about $7K monthly, or $86K annually. Backup partitions double the cost to $173K, and a remote disaster recovery data center for high availability doubles it again to $346K. Assuming a RAM footprint reduction of 50%, you will save $173K annually for every 1TB of data. These savings add up fast if you use more than 1TB of data, or if you run additional clusters, such as a cluster in NY and a cluster in London.
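The arithmetic in this example can be reproduced with a short Python sketch. The function names and the 30-day month are assumptions chosen to match the rounded figures quoted above; the $10 hourly rate is the illustrative price from the example, not an actual cloud price.

```python
# Illustrative cost model for the example above (not a pricing tool).
# Assumption: a 30-day month, which reproduces the rounded figures in the text.
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30
MONTHS_PER_YEAR = 12


def annual_ram_cost(hourly_rate_per_tb, terabytes=1.0, backup=True, dr_site=True):
    """Annual RAM cost in dollars for a given data size.

    backup doubles the cost (backup partitions);
    dr_site doubles it again (remote disaster recovery data center).
    """
    cost = (hourly_rate_per_tb * terabytes
            * HOURS_PER_DAY * DAYS_PER_MONTH * MONTHS_PER_YEAR)
    if backup:
        cost *= 2
    if dr_site:
        cost *= 2
    return cost


def annual_savings(hourly_rate_per_tb, footprint_reduction, terabytes=1.0):
    """Annual savings for a fractional RAM footprint reduction (0.5 means 50%)."""
    return annual_ram_cost(hourly_rate_per_tb, terabytes) * footprint_reduction


if __name__ == "__main__":
    print(annual_ram_cost(10, backup=False, dr_site=False))  # primary only
    print(annual_ram_cost(10, dr_site=False))                # with backups
    print(annual_ram_cost(10))                               # with backups and DR
    print(annual_savings(10, 0.5))                           # 50% reduction
```

Changing `terabytes` or `footprint_reduction` shows how quickly the savings scale with data size and with the share of unindexed properties.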
Below are benchmark results that show the expected footprint reduction at various ratios of indexed properties. This benchmark is based on 100k objects, each with 100 string fields of length 10.
When optimizing the RAM footprint, the impact on performance is as seen below. The added latency is 1 to 2 milliseconds on remote operations:
For optimal tuning, the user can select which objects to optimize and which to leave as is, trading storage against performance as required. Early Access to the storage optimization feature is available today.
Boosting Query Performance by More than 10X
InsightEdge now allows you to boost query performance with smart data locality using Broadcast Objects. An object can now be designated as a “Broadcast Object” with a single click.
This improves server-side JOIN performance by automatically replicating selected small tables of data to all the nodes in the cluster. In other words, it gives you the flexibility to balance storage footprint and performance, and can improve your reporting and BI performance by 10x. Broadcast Objects work best when you JOIN two tables where one is a large dynamic table of transactions and the other is a small static table that does not change frequently, such as daily exchange rates. The small static table is replicated to all nodes, but because it is small, the impact on RAM footprint is minor. Each node can then perform the JOIN locally, which significantly reduces network overhead and resource utilization, leading to lower latency and higher concurrency.
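To illustrate the idea, here is a minimal Python sketch of a broadcast join. It is a conceptual simulation only and does not use the GigaSpaces API: the function names, the field names (`id`, `currency`, `amount`, `rate`), and the modulo-based partitioning are all assumptions made for the example.

```python
# Conceptual simulation of a broadcast join (not the GigaSpaces API).
# Nodes are modeled as plain Python lists.


def partition_by_key(rows, key, num_nodes):
    """Spread the rows of a large table across nodes by an integer key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row[key] % num_nodes].append(row)
    return nodes


def broadcast(table, num_nodes):
    """Give every node its own full copy of a small reference table."""
    return [list(table) for _ in range(num_nodes)]


def local_join(transactions, rates):
    """Join one node's transactions with its local copy of the rates table."""
    rate_by_currency = {r["currency"]: r["rate"] for r in rates}
    return [
        {**t, "amount_usd": t["amount"] * rate_by_currency[t["currency"]]}
        for t in transactions
        if t["currency"] in rate_by_currency
    ]


def broadcast_join(transactions, rates, num_nodes):
    """Run the join node by node; no node needs rows held by another node."""
    tx_nodes = partition_by_key(transactions, "id", num_nodes)
    rate_nodes = broadcast(rates, num_nodes)
    results = []
    for node_tx, node_rates in zip(tx_nodes, rate_nodes):
        results.extend(local_join(node_tx, node_rates))
    return sorted(results, key=lambda row: row["id"])


if __name__ == "__main__":
    transactions = [
        {"id": 1, "currency": "EUR", "amount": 100},
        {"id": 2, "currency": "GBP", "amount": 200},
        {"id": 3, "currency": "EUR", "amount": 40},
    ]
    rates = [
        {"currency": "EUR", "rate": 1.25},
        {"currency": "GBP", "rate": 2.0},
    ]
    for row in broadcast_join(transactions, rates, num_nodes=2):
        print(row)
```

The trade-off is visible in `broadcast`: the reference table is stored once per node, which costs some extra RAM, and in return `local_join` never has to fetch rows over the network.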
Let’s take a real-life example. A hedge fund was querying a large table of live stock quote records, JOINed with three static tables containing additional information about the data source and the equity. When all four tables were independently sharded, the response time was too slow for their needs.
They then designated the three reference tables as Broadcast tables, leaving only the Quote table as an independent partitioned table. Performance improved dramatically, and the queries ran 12x faster. The standard deviation also dropped, leading to more predictable performance. The following table shows the results with 50 concurrent users:
Cloud Native Lifecycle Management
With v15.8, GigaSpaces adds support for a Kubernetes Operator to provide full lifecycle management for your data applications. This allows organizations to use Helm for day-1 deployment in a Kubernetes cluster, then use the Kubernetes Operator for day-2 management tasks. New data services, or new versions of business logic, can be deployed to production without any downtime. The Operator also supports automatic scaling, up or out, to handle unexpected workloads.