The InsightEdge Vision: Connecting Analytics to Impact

Ali Hodroj March 28, 2016
5 minute read

Data is born fast, but its insight value is often short-lived. This is a challenge that many enterprises seeking to seize business moments are trying to solve, whether it's a financial services firm building a fraud detection system, a telecommunications provider alerting its users to extra charges based on their location, or a retailer offering shoppers better deals in real time as they browse its catalog.

Over the last decade, we have mostly looked at data from a storage and historical perspective. But we now live in a world of converged infrastructures and heterogeneous, interconnected touch points that create a vast data footprint and demand extracting insight the moment data is born. Enterprises are increasingly interested in turning transient data (fast transactions, clickstreams, geo-locations, and sensor readings) into actionable insights that create transformational, even disruptive, business opportunities. We are already seeing a significant shift from accumulating data lakes at scale to analyzing data insights at speed. The latter demands new architectural approaches to fast data analytics, and that is the problem we aim to solve.

The Sub-second Data to Action Lifecycle Imperative

Based on our experience with GigaSpaces customers building extreme transaction processing systems, we are seeing a growing number of use cases that bring sophisticated analytics frameworks (mostly Apache Spark) together with transactional data sources in one unified solution. This eliminates both the cost of the ETL-to-Hadoop bottleneck and the operational complexity of integrating a real-time streaming analytics data pipeline with transactional applications. Here are some of the important trends driving this imperative:

Hyper-personalization and Omni-channel

Customer experience is a top priority for 75% of data executives (Forrester) and a fundamental driving force behind real-time analytics adoption. To provide a seamless, contextual user experience across all customer touch points (web, mobile, in-store, call center), a typical retailer or financial services firm needs to converge a customer's historical data with real-time transactions across online and offline data domains within a few seconds.

Analytics over Transient Data

In today's world of real-time, high-throughput data-generating applications, most of the data we deal with has a short-lived insight value. Consider fast-moving data such as transactions, clickstreams, geo-locations, and sensor readings, and compare its insight value a few seconds after creation versus a few minutes or hours later. In these cases it is not feasible to move all the data generated by a particular source to a centralized data center or cloud for processing. From an insight-to-action latency perspective, it makes sense to capture insight from the data before transmitting it to the center.

InsightEdge: Moving analytics to data, not the other way around

A growing number of big data processing platforms have gained traction in the last couple of years to solve the disk I/O problems of massive data workloads. Apache Spark is the most notable, providing an immutable in-memory caching layer on top of NoSQL data stores that unifies batch, streaming, and other complex analytics under one common API and data structure (RDDs/DataFrames). However, we cannot address the challenge of connecting insight with action simply by running the same data processing techniques faster. Once we gain insight, how do we make it actionable in real time?
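Before getting to that question, here is a minimal, self-contained sketch of the "one common API" mentioned above, using only stock Spark 1.x in local mode with a made-up Txn case class: a small dataset is pinned in Spark's in-memory cache and then queried with SQL through the same RDD/DataFrame abstraction that batch and streaming jobs share.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical domain type, for illustration only.
case class Txn(accountId: String, amount: Double)

object UnifiedApiSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("unified-api-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Build a DataFrame, pin it in Spark's in-memory cache, and query it with SQL.
    // Batch, SQL, and streaming workloads all share this RDD/DataFrame abstraction.
    val txns = sc.parallelize(Seq(Txn("a1", 42.0), Txn("a2", 99.5), Txn("a1", 7.25))).toDF()
    txns.cache()
    txns.registerTempTable("txns")

    sqlContext.sql("SELECT accountId, SUM(amount) AS total FROM txns GROUP BY accountId").show()

    sc.stop()
  }
}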

Our experience with customers has shown that enterprises that rely on data insights as a primary means of competitive differentiation are far more successful when their analytics workloads leverage a decentralized, distributed in-memory computing approach. Instead of collecting data through streams or ETL into a centralized data lake for post-processing, analytic workloads should run at the data source or network edge. To enable this connection between analytics and business impact, we view in-memory computing (beyond pure caching scenarios) as a key architectural component that rounds out the capabilities required for a modern fast data ecosystem.

In-memory data grids give us a way to process both transactional and analytical workloads at ultra-low latency and high throughput, while providing high availability and distributed processing across nodes, data centers, and clouds. Our goal is to combine the sophisticated analytics and ease of use of Apache Spark with a high-performance, ultra-low-latency in-memory data grid that has been battle-tested over the last decade across leading financial services, retail, telecommunications, and transportation institutions.

The InsightEdge early release is here. Give it a go!

Quick intro to what InsightEdge is about:

High-performance Spark

We provide an implementation of all the Spark APIs (Spark Core, SQL, Streaming, MLlib, and GraphX) on top of a high-performance, extreme transaction processing, in-memory data grid that leverages RAM and, optionally, SSD/flash storage for low-latency workloads. Our goal is to tier the storage and processing of Spark data and workloads between the Spark workers and the underlying data grid containers. This significantly reduces the disk, network, compute, and Spark memory-management bottlenecks in complex analytics workloads.
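As a rough illustration of what "Spark on a data grid" looks like from the application side, here is a sketch built on Spark's standard data-source reader API. The format name "my.grid.datasource" and the "collection" option are hypothetical placeholders rather than the actual InsightEdge connector API, and the sketch assumes such a connector is available on the classpath; only the read/option/load calls and the SQL query are stock Spark.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object GridBackedQuerySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("grid-backed-query"))
    val sqlContext = new SQLContext(sc)

    // Hypothetical: expose a DataFrame whose partitions are served directly from
    // in-memory data grid containers co-located with the Spark workers.
    val trades = sqlContext.read
      .format("my.grid.datasource")   // placeholder format name, not the real connector
      .option("collection", "Trade")  // placeholder option: which grid type to expose
      .load()

    trades.registerTempTable("trades")

    // The query itself is ordinary Spark SQL; the grid is expected to answer the scan
    // from RAM (or an SSD/flash tier) rather than from disk.
    sqlContext.sql("SELECT symbol, AVG(price) AS avgPrice FROM trades GROUP BY symbol").show()

    sc.stop()
  }
}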

Simplified Real-time Data Pipelines

Our technology provides a single unified cluster that combines Spark, polyglot data APIs (objects, geospatial, documents, JSON, etc.), and seamless connectivity with upstream (e.g. Kafka) and downstream (e.g. HDFS, Cassandra, MongoDB) data sources. We believe this is the fastest and simplest way for enterprises to stand up streaming pipelines at the data source or edge.
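To make the pipeline shape concrete, here is a minimal Spark Streaming sketch using the Spark 1.x Kafka direct-stream integration. The broker address, the "clicks" topic, the CSV layout, and the HDFS output path are all assumptions made for illustration; only the Spark and Kafka-integration calls themselves are standard.

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ClickstreamPipelineSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("clickstream-pipeline").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Upstream: consume the (assumed) "clicks" topic from an (assumed) local Kafka broker.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val clicks = KafkaUtils
      .createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, Set("clicks"))
      .map { case (_, value) => value }

    // A simple per-batch aggregation: count events per user id (assumed to be the first CSV field).
    val counts = clicks
      .map(line => (line.split(",")(0), 1L))
      .reduceByKey(_ + _)

    // Downstream: persist each micro-batch to an (assumed) HDFS location as text files.
    counts.saveAsTextFiles("hdfs://namenode:8020/pipelines/clicks/counts")

    ssc.start()
    ssc.awaitTermination()
  }
}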

Where to next?

Our first early access release of InsightEdge will be available to download from our website when the Strata+Hadoop San Jose 2016 conference commences. While we work towards a GA release in June 2016, you can learn more about InsightEdge and browse the documentation. Don't forget to follow us on Twitter and LinkedIn, or stop by to chat with us on Slack.

CATEGORIES

  • Fast Data
  • GigaSpaces
  • InsightEdge
  • Spark