I’m almost sure that when Da Vinci said “Simplicity is the ultimate sophistication,” he wasn’t speaking about real-time AI, machine learning and transactional processing. And yet his words are ever so relevant when we think about overcoming the complexities of big data and analytics that enterprises are facing today.
Our customers tell us that getting insights from their data is complex. They typically rely on batch analytics, which yields results but does not effectively address time-sensitive business decisions. Obtaining smart, real-time insights and acting on them within a critical time frame is even more complex, and getting that done at scale? Well, that’s just one more hurdle to overcome.
GigaSpaces has been addressing a common and growing theme across all of our customers, whether they’re in finance, insurance, retail, eCommerce, telecommunications, transportation or other industries: the need to shift towards using real-time (hot) data to make real-time or near-real-time business decisions. This shift is happening as organizations begin to converge their operational systems with their management systems, with the goal of becoming more efficient and data-driven by pulling actionable insights at the moment data is born.
The key is simplicity, and this is one of the main tenets of InsightEdge release 14.0.
Smarter, Faster Insights – Based on Real-Time Data in Any Format, Enriched with Historical Context
Insight-driven organizations are beginning to run real-time AI and machine learning models for the instant insights and actions needed for applications such as fraud detection, location-based advertising, personalized offers, dynamic pricing, dynamic risk analysis, predictive maintenance and others. Accurate analytical results require access to both real-time and historical data from various sources (structured, semi-structured and unstructured).
Data from multiple sources is ingested, and the hot data is enriched with cold (historical) data stored in data lakes such as S3 and Hadoop in a faster, simplified manner. InsightEdge contains all the necessary SQL, Spark, Streaming, and Deep Learning toolkits for scalable, real-time, data-driven solutions. The platform offers ultra-low latency, high-throughput transaction and stream processing, and co-location of applications and analytics to act on time-sensitive data–as it is born–at sub-second performance.
InsightEdge Platform and XAP 14.0
In this release of InsightEdge and XAP, we continue to deliver on our goal of simplifying the complexities of big data and analytic workloads, freeing our customers to develop and deploy the innovative applications they require for time-sensitive, mission-critical services.
Newly Available Features and Functionality
- Kubernetes environment – for simplifying deployment in cloud, on-premise and hybrid environments
- Intel® Optane™ DC Persistent Memory support (MemoryXtend PMEM driver) – disruptive storage layer supported with the MemoryXtend module for significant TCO savings, increasing capacity with near in-memory performance.
- New hot-swap capability for rebalancing policy support.
- Data visualization with Apache Zeppelin for InsightEdge users.
Kubernetes: One Click, Any Cloud, Always-On
In the four years since the Kubernetes project was first introduced, it has become the de facto standard for containerized application deployment. Kubernetes can be used in any type of environment–on-premise, cloud, or hybrid–and is supported by every major cloud provider, including public cloud platforms like AWS, Microsoft Azure, GCP and others.
GigaSpaces’ InsightEdge Platform and the XAP data grid both support the Kubernetes environment. Benefits include:
- The ability to deploy GigaSpaces-based applications in whatever environment best suits the business needs of the enterprise.
- Kubernetes synergizes with InsightEdge and XAP, simplifying the operationalizing of machine learning and transactional processing at scale.
- InsightEdge and XAP utilize key features of Kubernetes, such as cloud-native orchestration automation with self-healing, cooperative multi-tenancy, and RBAC authorization.
- Auto-deployment of data services and of deep learning and machine learning frameworks, including Apache Spark jobs, stateful Processing Units, and the Apache Zeppelin web notebook.
Easy, Seamless Deployment with Helm
Helm, the Kubernetes package manager, is used for installing InsightEdge and XAP in the Kubernetes environment. Helm makes deploying complex applications more portable, supports automatic rollbacks, and is a familiar pattern for developers that is easy to understand.
The GigaSpaces Helm charts are published in a dedicated Helm chart repository. You can unpack the relevant Helm chart locally, and use the following command, for example, to start XAP:
helm install xap --name helloworld
Read more about how to install InsightEdge and XAP in Kubernetes on our documentation website.
Kubernetes: You Choose – Service Grid or Cloud Native with Kubernetes
Kubernetes utilizes Docker containers, which eliminates the need for a Grid Service Container. The Kubernetes deployment of InsightEdge and XAP further leverages the Kubernetes cluster management approach by swapping out Service Grid components for pod architecture, including:
- Management Pod: Replaces the Grid Service Agent and the XAP Manager.
- Data Pod: Analogous to the Processing Unit instance in the platform.
- Driver Pod: Contains the Spark Driver, which creates Spark Executors, connects to them, and executes the required application code.
- Executor Pod: Contains the Spark Executor, which runs the Spark job on the data in the Data Pod.
- Zeppelin Pod: Contains the Apache Zeppelin web-based notebook.
Kubernetes: Enhances our Always-On Promise
InsightEdge and XAP leverage anti-affinity rules to ensure that the primary and replica are always on separate physical machines within the cluster. Kubernetes StatefulSets are also utilized, so each pod has a persistent identifier that is maintained across any rescheduling. These and other self-healing, load balancing and fast-load mechanisms ensure no downtime and no data loss.
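To make the anti-affinity idea concrete, here is a minimal Python sketch of the placement constraint itself. It is a toy illustration, not the Kubernetes scheduler and not GigaSpaces code: the function name `place_partitions` and the node names are hypothetical, and the ring-style assignment is just one simple way to guarantee that a primary and its replica never share a node.

```python
from typing import Dict, List, Tuple


def place_partitions(nodes: List[str], partitions: int) -> Dict[int, Tuple[str, str]]:
    """Assign each partition a (primary_node, replica_node) pair on distinct nodes.

    Hypothetical helper: it only illustrates the anti-affinity constraint.
    """
    if len(nodes) < 2:
        raise ValueError("anti-affinity needs at least two nodes")
    placement = {}
    for p in range(partitions):
        primary = nodes[p % len(nodes)]
        # The replica goes to the next node in the ring, so it can never
        # land on the same node as its primary.
        replica = nodes[(p + 1) % len(nodes)]
        placement[p] = (primary, replica)
    return placement


placement = place_partitions(["node-a", "node-b", "node-c"], partitions=4)
```

In a real cluster the same rule is declared rather than computed: the scheduler is told which pods must not share a node and finds a valid placement on its own.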
Support for Tiered Storage
Customers can use the MemoryXtend off-heap storage driver to configure data prioritization according to each application’s business logic, ensuring that the most important data resides in the fastest storage tier for optimized TCO.
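As a rough sketch of what business-driven prioritization can look like, the Python example below routes records to a storage tier using ordered rules. Everything in it is an assumption made for illustration: the `Trade` record, the tier names (`ram`, `pmem`, `ssd`) and the rules themselves are hypothetical and do not reflect the MemoryXtend configuration syntax.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trade:
    """Hypothetical business record used only for this illustration."""
    symbol: str
    age_days: int
    vip: bool


# Ordered rules: the first matching predicate wins.
# Tier names and thresholds are made up for the example.
RULES: List[Tuple[str, Callable[[Trade], bool]]] = [
    ("ram", lambda t: t.vip or t.age_days <= 1),   # hottest data
    ("pmem", lambda t: t.age_days <= 30),          # warm data
]
DEFAULT_TIER = "ssd"                               # everything else


def choose_tier(trade: Trade) -> str:
    """Return the name of the fastest tier whose rule matches the record."""
    for tier, matches in RULES:
        if matches(trade):
            return tier
    return DEFAULT_TIER
```

The point of the design is that the rules are expressed in terms of the application’s data (customer importance, record age) rather than generic access statistics, so the fastest tier is reserved for what the business considers most valuable.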
Intel® Optane™ DC Persistent Memory: More Capacity, Lower Cost, No Compromises
Intel recently unveiled its new Optane DC persistent memory (PMEM) modules, offering data centers what they describe as an entirely new class of memory and storage technology that offers the unprecedented combination of high capacity, affordability and persistence.
GigaSpaces is among a select group of Independent Software Vendors (including Amazon, Google, and Microsoft) that were chosen by Intel to partner with them on integrating its Optane DC technology in time for its introduction to the market.
As a result of this partnership, GigaSpaces now offers our newest PMEM driver for the MemoryXtend module. Combining the strength of Optane DC Persistent Memory and GigaSpaces, InsightEdge provides customers with the required performance and TCO optimization for uncompromised business results:
- Extreme in-memory performance at a significantly lower cost.
- Large reduction in the number of servers required, which reduces footprint, power, maintenance, network and other overhead costs.
- Optimized manageability via less networking, data movement, and security configuration.
- Smarter, faster insights and actions, allowing machine learning models to run on a more complex feature vector, provide more accurate inferencing and scoring, and include historical context in the real-time decision/business logic.
- More customization opportunities and flexibility according to application priorities are available with InsightEdge’s business-driven, intelligent MemoryXtend tiered storage module.
- Instant recovery time as there is no need for an additional persistence layer.
TCO and Access Rate per Storage Tier
Hot Swap – Rebalancing with Zero Downtime
Our new demote capability in the REST Manager API, which supports both InsightEdge and XAP, makes it easier, faster, and simpler to rebalance a system after significant environment change scenarios, such as failover or scaling. In the past, the only way to demote a primary instance in order to rebalance a system was to force a restart. Now customers can write their own rebalancing policies that take advantage of the ability to perform a hot swap during runtime, without having to reload the data.
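A custom rebalancing policy can be quite small. The Python sketch below shows one possible decision step under stated assumptions: the function names, the host and instance identifiers, and the "differ by more than one primary" threshold are all hypothetical, and the `demote` callback merely stands in for whatever call the policy would make to the REST Manager API (the actual endpoint is not shown here).

```python
from typing import Callable, Dict, List, Optional


def pick_primary_to_demote(primaries_by_host: Dict[str, List[str]]) -> Optional[str]:
    """Pick a primary on the busiest host, or None if hosts are balanced.

    Hosts count as balanced when their primary counts differ by at most one.
    """
    if not primaries_by_host:
        return None
    busiest = max(primaries_by_host, key=lambda h: len(primaries_by_host[h]))
    lightest = min(primaries_by_host, key=lambda h: len(primaries_by_host[h]))
    if len(primaries_by_host[busiest]) - len(primaries_by_host[lightest]) <= 1:
        return None
    # Deterministic choice so repeated runs pick the same instance.
    return sorted(primaries_by_host[busiest])[0]


def rebalance_once(
    primaries_by_host: Dict[str, List[str]],
    demote: Callable[[str], None],
) -> Optional[str]:
    """Run one policy step: demote at most one primary via the callback."""
    candidate = pick_primary_to_demote(primaries_by_host)
    if candidate is not None:
        demote(candidate)  # placeholder for the real demote request
    return candidate


# Usage with a fake callback that only records what would be demoted.
demoted: List[str] = []
state = {"host-1": ["p1", "p2", "p3"], "host-2": ["p4"]}
chosen = rebalance_once(state, demoted.append)
```

Because the swap happens at runtime, a policy like this can run after every failover or scale-out event and nudge the system back toward an even spread one instance at a time.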
Native Data Visualization Tool
InsightEdge now offers a JDBC-based interpreter for Apache Zeppelin. Customers can use this new InsightEdge interpreter to access data directly from the data grid alongside the Spark interpreter. Queries to the data grid using the InsightEdge interpreter are faster and lighter, and use significantly fewer system resources. This provides incredible value for analysts and developers, who can now visualize Space data on the fly within the Zeppelin notebook.
Space Data Visualized in the Apache Zeppelin Web Notebook
The release of InsightEdge 14.0 presents our customers with significant tools for simplified deployment, whether it’s using Kubernetes, leveraging new rebalancing policies, faster visualization or the ability to drastically reduce TCO.
Da Vinci’s words, “Simplicity is the ultimate sophistication,” continue to inspire GigaSpaces and our product roadmap as we power our customers’ digital transformation, helping them become insight-driven and maintain their competitive edge.
Read more about our integration with Intel Optane DC Persistent Memory on Intel AI Builder.