The Rise of In-Memory Computing
Adoption of In-Memory Computing (IMC) is on the rise. This can be attributed to the growing demand for faster processing and analytics on big data, the need to simplify architecture as the number of data sources increases, and technology enhancements that are optimizing total cost of ownership (TCO).
In Forrester’s report for enterprise architecture professionals, the case for In-Memory Computing is clear:
“Data is the lifeblood of your entire organization. It should enlighten every function of the business, including CX, operations, marketing, sales, service, and finance. A data management (DM) strategy is critical. The goal should be clear: Provide all business functions with quick and complete access to all of the data and analytics that they need, both now and in the future.”
Gartner further reinforces this in its In-Memory Computing Technologies 2018 Predictions, noting that:
“The digitalization of business generates an inexhaustible demand for faster performance, greater scalability and deeper real-time insight, which is boosting innovation around IMC technologies.”
In-Memory Computing is also the preferred technology for unifying transactional and analytical processing for real-time insights and closed-loop analytics, which is further driving IMC innovation and growth. In the Market Guide for HTAP-Enabling In-Memory Computing Technologies, where HTAP stands for hybrid transactional and analytical processing, Gartner states that:
“Enabled by IMC technologies, HTAP architectures support the needs of many new application use cases in the digital business era, which require scalability and real-time performance. Data and analytics leaders facing such demands should consider HTAP solutions supported by IMC.”
Gartner’s HTAP, also called HOAP (hybrid operational and analytical processing) by 451 Research and Translytical Platforms by Forrester, enables organizations to carry out analytics on incoming transactions, taking advantage of the transaction window and triggering instant actions.
Furthermore, the availability of SSD and persistent memory technologies is also driving costs down. Consequently, IMC platforms that support these data storage tiers intelligently are helping organizations optimize their TCO while delivering in-memory performance.
Why In-Memory Computing?
To maintain a competitive edge and meet today’s demands for optimal customer experience, enterprises must deal with the constant upsurge of available data and the never-ending demands for better and faster performance. This is fueling the development of In-Memory Computing technologies, because In-Memory Computing is all about how much data can be ingested and analyzed, and how fast the analysis can be performed.
In-Memory Computing has evolved because traditional solutions, typically based on disk storage and relational databases using the SQL query language, are inadequate for today’s business intelligence (BI) needs – namely the provision of super-fast computing and scaling of data in real time.
In-Memory Computing: Basic Principles and Significance
Figure 1: In-Memory Computing Basic Principles and Significance
In-Memory Computing is based on two main principles: the way data is stored and scalability – the ability of a system, network or process to handle constantly growing amounts of data, or its potential to be elastically enlarged to accommodate that growth. This is achieved by leveraging two key technologies: random-access memory (RAM) and parallelization.
High Speed and Scalability: To achieve high speed and performance, In-Memory Computing is based on RAM data storage and indexing. This results in data processing and querying that is more than 100 times faster than any other solution, delivering optimal and uncompromised performance and scalability for any given task.
For scalability – which is essential for big data processing – In-Memory Computing is based on parallelized distributed processing. In contrast to a single, centralized server managing and providing processing capabilities to all connected systems, distributed data processing offers a computer-networking method in which multiple computers across different locations share computer-processing capabilities.
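The two principles above can be sketched in a few lines of Python. This is a toy illustration only, not GigaSpaces' API: the PartitionedStore class, its method names and the "order-N" keys are hypothetical, and threads stand in for the separate machines of a real distributed grid. Records are held in RAM in hash-indexed partitions, each key is routed to exactly one partition, and an aggregate is computed by scattering the work across partitions and gathering the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


class PartitionedStore:
    """Toy in-memory store: records live in RAM, split across partitions."""

    def __init__(self, num_partitions=4):
        # Each partition is a plain dict, i.e. a hash index held in memory.
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Stable hash routing: the same key always maps to the same partition.
        index = crc32(key.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def put(self, key, value):
        self._partition_for(key)[key] = value

    def get(self, key):
        # A lookup touches exactly one partition and one hash bucket.
        return self._partition_for(key)[key]

    def parallel_sum(self):
        # Scatter: each partition computes its partial sum independently.
        # Gather: the partial sums are combined into the final result.
        with ThreadPoolExecutor(max_workers=len(self.partitions)) as pool:
            partials = pool.map(lambda part: sum(part.values()), self.partitions)
            return sum(partials)


order_store = PartitionedStore(num_partitions=4)
for i in range(1000):
    order_store.put(f"order-{i}", float(i))

print(order_store.get("order-42"))   # 42.0
print(order_store.parallel_sum())    # 499500.0
```

In this sketch the threads only illustrate the scatter/gather shape of the computation; in a real deployment each partition would run in its own process or on its own node, which is where the scalability comes from.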
Real-time Insights: In-Memory Computing allows for the co-location of business logic, analytics and data ingested from multiple sources (a multi-model store). In this way, In-Memory Computing is much more than just producing an analysis much faster than before; it’s about becoming predictive in analysis itself!
By simultaneously addressing massive amounts of streaming, hot and historical data (as in GigaSpaces’ solution), In-Memory Computing supports the running of real-time advanced analytics and machine learning for instant insights that are immediately leveraged by business logic co-located with the memory fabric. When something happens that can affect business operations, customers’ actions, regulatory compliance and more, an immediate understanding of the impact and consequences is made available, enabling an appropriate real-time response and decision.
Furthermore, continuous predictive analysis, which leverages the ability to ingest and analyze millions of events per second, helps prevent undesired occurrences such as equipment breakdowns, customer churn, cyber attacks and more.
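As a rough illustration of business logic co-located with the data, the sketch below runs a rule inside the same process that holds the readings, so every write can trigger analysis and action immediately. Everything here is an assumption made for the example: the EventStore class, the "axle-7" sensor, the three-reading window and the 90-degree threshold are hypothetical and do not come from any GigaSpaces product.

```python
from collections import defaultdict, deque
from statistics import mean


class EventStore:
    """Toy in-memory store that runs registered logic on every write."""

    def __init__(self, window=3):
        # Keep only the most recent readings per sensor so memory stays bounded.
        self.readings = defaultdict(lambda: deque(maxlen=window))
        self.listeners = []

    def on_write(self, listener):
        # Register business logic that runs next to the data.
        self.listeners.append(listener)

    def write(self, sensor_id, value):
        self.readings[sensor_id].append(value)
        rolling_avg = mean(self.readings[sensor_id])
        # The analysis result is handed to the co-located logic immediately.
        for listener in self.listeners:
            listener(sensor_id, rolling_avg)


THRESHOLD = 90.0   # hypothetical alert threshold
alerts = []


def overheating_rule(sensor_id, rolling_avg):
    # Hypothetical rule: act as soon as the rolling average crosses the threshold.
    if rolling_avg > THRESHOLD:
        alerts.append((sensor_id, round(rolling_avg, 1)))


event_store = EventStore(window=3)
event_store.on_write(overheating_rule)
for temperature in [80.0, 85.0, 95.0, 100.0, 105.0]:
    event_store.write("axle-7", temperature)

print(alerts)   # [('axle-7', 93.3), ('axle-7', 100.0)]
```

The rule fires on the fourth and fifth readings, when the rolling average of the last three values first exceeds the threshold, without the data ever leaving the process that stores it.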
Abundant Range of Use Cases: In-Memory Computing is, of course, applicable for companies dealing with large volumes of data, particularly when there’s a consumer dimension such as retail, financial services, insurance, transportation, telco and utilities. Typical examples include risk and transaction management in banks/financial institutions, fraud detection for payments and in insurance, trade promotion simulations in consumer product companies and real-time/personalized advertising. But In-Memory Computing is applicable for any industry or market where real-time analysis, insights and predictions based on streaming and historical data offer business value, such as geospatial data analysis, predictive maintenance and route optimization in transportation.
Enabling Technology: Many of today’s applications and technologies would not be possible without the integration of In-Memory Computing. Typical examples of this enabling technology role include applications implementing blockchain technology (which allows digital information to be distributed but not copied), or applications involving geospatial/GIS processing for transportation (such as real-time directions on traffic congestion, recommended routes and traffic hazards).
The Hybrid Transactional and Analytical Processing (HTAP) Use Case
In contrast to the traditional computing paradigm of moving data to a separate database, processing it and then saving it back to the data store, with In-Memory Computing everything can be placed in an in-memory data grid and distributed across a horizontally scalable architecture. This is accomplished at low latency, because the disk I/O that prevents workloads and mixed heterogeneous workloads from happening in real time has been eliminated.
The In-Memory Computing concept powers a new unified paradigm. Instead of separating transactional databases from analytics databases, which leads to disk I/O and disk bottlenecks, working with in-memory data stores enables the easy elimination of bottlenecks and handles mixed workloads within the same architecture.
To leverage the full value of the unified analytical and transactional processing paradigm, an In-Memory Computing platform must support:
- Applications with polyglot persistence (microservices, multiple data sources).
- Analytics that are mostly real-time streaming and predictive, in addition to historical reporting.
- Data science with modeling against live data for continuous machine and deep learning. This is based on an intuitive fail-fast/recover-fast workflow model that needs to be as close as possible to transaction data in order to act in the moment and adjust properly.
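The unified paradigm can be sketched with Python's built-in sqlite3 module, using an in-memory database as a stand-in for an in-memory data grid. This is an assumption made purely for illustration: the payments table, the record_payment function and the "three times the running average" rule are hypothetical. The point is that the analytical query and the transactional write run against the same in-memory data, inside the transaction window, with no copy to a separate analytics database.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
)


def record_payment(account, amount):
    """Store a payment and flag it if it looks unusual for this account."""
    with conn:  # commits on success, rolls back if an exception is raised
        # Analytical read over the account's existing history.
        avg_amount, history = conn.execute(
            "SELECT AVG(amount), COUNT(*) FROM payments WHERE account = ?",
            (account,),
        ).fetchone()
        # Transactional write on the same store.
        conn.execute(
            "INSERT INTO payments (account, amount) VALUES (?, ?)",
            (account, amount),
        )
    # Hypothetical rule: with enough history, flag amounts above 3x the average.
    return history >= 3 and amount > 3 * avg_amount


flags = [record_payment("acct-1", amount) for amount in (20.0, 25.0, 30.0, 400.0)]
print(flags)   # [False, False, False, True]
```

Only the last payment is flagged: the first three build up the history, and the fourth is more than three times the running average of 25.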
In-Memory Computing: GigaSpaces’ Approach
Figure 2: Customer Challenges that are addressed with GigaSpaces In-Memory Computing Platforms
At GigaSpaces, we’ve been delivering In-Memory Computing solutions for over a decade, powering mission-critical applications for leading enterprises worldwide. We’ve taken our core In-Memory Computing competence and developed the fastest in-memory real-time analytics platform, InsightEdge. This platform is geared for operationalizing machine learning and transactional processing, at scale, running real-time analytics and machine learning models on streaming data as it’s born, hot data and historical data, for instant insights to action. In this way, InsightEdge supports insight-driven organizations in addressing time-sensitive decisions and enhances business operations, regulatory compliance and customer experience.
“GigaSpaces has taken a differentiated approach with its combination of the core data grid/cache functionality and Apache Spark to provide high-performance data ingestion, as well as a unified interface for both batch and real-time analytics. While the company is also not alone in its support for Kubernetes, support for multi-region clusters via its WAN Gateway functionality is also a differentiator.”
Matt Aslett, VP Data, AI and Analytics
The InsightEdge software platform contains all the necessary frameworks for scalable data-driven solutions, including SQL, Spark, streaming, machine learning and deep learning. Applications leverage faster and smarter insights from machine learning models running on any data source – whether structured, unstructured or semi-structured – while seamlessly accessing historical data from data lakes, Amazon S3, Microsoft Azure Blob Store and others.
Figure 3: Advanced Analytics, Machine Learning and Intelligent Data Storage Tiering for TCO, BI Dashboard and Visualization Support
InsightEdge is a cloud-native, microservice-based architecture for cloud, on-premise or hybrid environments, and it supports intelligent, multi-tiered storage across RAM, SSD, Storage Class Memory and Persistent Memory. The simplified architecture not only enables keeping up with “millisecond industries”; it also reduces TCO and data movement complexity by radically minimizing the number of ‘moving parts’.
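To make the idea of multi-tiered storage concrete, here is a minimal sketch of a two-tier store. It is not how InsightEdge implements tiering: the TieredStore class and the least-recently-used policy are assumptions chosen for the example, and two Python dictionaries stand in for RAM and for a slower tier such as SSD. Recently used entries stay in the small hot tier, and older entries are demoted to the larger cold tier and promoted again when read.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a small hot tier backed by a larger cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()   # stands in for RAM, ordered by recency
        self.cold = {}             # stands in for SSD or persistent memory

    def put(self, key, value):
        self.cold.pop(key, None)   # avoid keeping a stale copy in the cold tier
        self.hot[key] = value
        self.hot.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        value = self.cold.pop(key)  # raises KeyError if absent from both tiers
        self.hot[key] = value       # promote to the hot tier on access
        self._evict()
        return value

    def _evict(self):
        # Demote least recently used entries until the hot tier fits.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value


tiers = TieredStore(hot_capacity=2)
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    tiers.put(key, value)

print(tiers.get("a"))      # 1 (promoted from the cold tier)
print(sorted(tiers.hot))   # ['a', 'c']
print(tiers.cold)          # {'b': 2}
```

The hot tier's capacity is the cost lever in this sketch: a smaller hot tier needs less RAM but sends more reads to the slower tier.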
In-Memory Computing: InsightEdge Benefits
This integration of In-Memory Computing in InsightEdge offers a range of benefits, from instant, smarter insights and extreme performance to TCO optimization and mission-critical 99.999% availability.
Instant, Smarter Insights
- Unlocked as data is born, enriched with historical data, empowering time-to-analytics-to-action at sub-second scale.
- Event-driven analytics and co-located business logic trigger analysis and action at exactly the right time.
- Seamless and faster access to historical data on data lakes
- Data consistency (ACID compliant)
- Predictive analytics from SQL, streaming, machine learning through Apache Spark and deep learning with TensorFlow and other frameworks.
- BI tool integrations, including Tableau, Looker and Power BI.
Extreme Performance
- Ultra-low latency, high-throughput transactions and stream processing supporting millions of IOPS.
- Co-location of applications and analytics to act on time-sensitive data at millisecond performance.
- Auto-scaling any tier on peak load on-premise or any cloud
TCO Optimization
- Eliminates data movement complexity and simplifies data governance, radically minimizing the number of moving parts and reducing TCO.
- Cloud-native and multi-cloud: an infrastructure-agnostic solution deployed on cloud, on-premise or hybrid environments.
- Controls data by customizing multi-tiered storage preferences across RAM, SSD and Persistent memory to optimize business results and hardware costs.
- Out-of-the-box ETL, intelligently controlling data movement from the speed layer to the historical data store.
Mission-Critical 99.999% Availability
- Mature, battle-tested platform.
- Highly available with up to five-nines (99.999%) reliability, auto-healing and no single point of failure.
- Geo-redundancy, fast data replication and native persistence for immediate recovery.
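To put the availability figure in perspective, the short calculation below converts an availability percentage into the maximum downtime it allows per year. This is plain arithmetic on the percentages, not a measured result for any platform; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes, ignoring leap years


def max_downtime_minutes(availability_pct):
    """Maximum yearly downtime (in minutes) allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% availability -> {max_downtime_minutes(pct):.2f} minutes/year")
```

At 99.999%, the budget works out to roughly 5.26 minutes of downtime per year, compared with about 52.6 minutes at 99.99% and about 8.8 hours at 99.9%.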
In-Memory Computing: Case Studies
The use cases for the implementation of In-Memory Computing are extensive. The following are a few examples from our customers.
The Customer: A top 10 European banking and financial services company focused on retail banking, corporate investments and global investments and active across Europe, the USA and Asia-Pacific.
The Need: Calculate risks with extremely low latency (sub-second); improve overall performance of hundreds of applications in multiple geographies; achieve ‘full’ automation, consistency and resiliency; and reduce manual intervention to a minimum.
The Challenge: The ingestion of millions of data objects per day and the provision of about half a billion reads with over 100 million notifications; the distribution of referential data from heterogeneous sources to client applications using standardized modeling while preserving data consistency; and the provision of standard services.
Figure 4: Implementation of InsightEdge Through a Worldwide Platform Deployed in Paris, New York and Hong Kong
- Access to data in less than 1 second
- Sub-second performance for users across global sites.
- Data consistency for hundreds of services, cache and web applications in multi-geographies.
- Zero downtime
The Customer: A leading transportation company focused on rail transportation and real estate, supplying rail-based freight transportation services.
The Need: To run predictive analytics on equipment while ingesting and processing streaming data at scale, in order to monitor and diagnose potential failures to support operators in reducing maintenance costs, improving troubleshooting and redirecting trains in a timely manner.
The Challenge: Process streaming data at scale and query from a live data mart; event-driven analytics and business logic; multiple small, low-volume streams requiring correlation and statefulness (the IoT streaming problem); and real-time analytics leveraging GPS and train sensor data.
Figure 5: Integration of InsightEdge to Ingest and Process Streaming Data from Millions of Sensors, Provide Real-time Insights and Respond Instantly to Situations
- Big data pipeline simplified by high-performance, high availability stream ingestion and processing from multiple sources
- Safety and reliability of journeys improved and cost-saving measures that conserve fuel and increase safe operating speeds achieved by leveraging machine learning on real-time and historical data from train events, fence events and GPS.
- Timely maintenance and redirection of trains through event-based triggers which direct the output to operational workflows and live dashboards.
The Customer: PriceRunner, a leading European shopping comparison site which has 4.4 million unique visitors and receives prices from 18,00 different merchants, every month.
The Need: To ensure the provision of real-time price comparisons, particularly at peak periods such as the night before Black Friday, when traffic rises to as much as 20 times its normal level, without compromising customer experience.
The Challenge: Support scalability requirements for peaks such as Black Friday, without compromising performance; no downtime; real-time analytics on transactional data; event-driven applications powering integrated applications; and microservices architecture for rapid development and deployment.
Figure 6: InsightEdge Provides a Load-balanced Environment that Flexibly and Quickly Scales Out without Compromising Performance and Speed
- Scalability while retaining performance – at peak traffic spikes of 20 times normal traffic and updating 100 million prices in parallel, response time rose from 5ms to 8ms for the 95th percentile of the product page endpoint.
- Rapid development and deployment of multiple services through microservices architecture integrated with a big data ecosystem.
- Service never down, ensuring high levels of customer experience and powering PriceRunner’s leadership.
In-Memory Computing is one of the key buzzwords in enterprises and organizations today. That’s because it’s revolutionizing data analysis by vastly speeding up computing and scaling to never-ending quantities of data.
For any enterprise or organization, implementation of In-Memory Computing allows for much broader and more advanced HTAP use cases. It can be implemented in point-of-decision HTAP (transactional and analytical applications sharing the same data infrastructure) and in-process HTAP (transactional processing guided by real-time analytics).
Now’s the time for enterprises and organizations to utilize the massive quantities of data they have at their fingertips. By integrating In-Memory Computing they will be able to:
- Accelerate machine learning on their real-time and historical data for smarter insights with sub-second response time, at scale.
- Simplify development and deployment of their cloud-based and on-premise applications for faster time-to-market.