In today’s data-driven digital world, enterprises struggle to gain real-time insights from their data. It’s even harder to leverage those insights at scale for instant business impact.
As the number of applications and the amount of data increase, ingesting, accessing and analyzing data becomes an ever-growing challenge. For data-driven digital transformations to succeed, a modern approach to integrating complex data ecosystems is required, one that can help:
- Prevent back-end data systems from being overwhelmed with excessive workloads
- Decouple the data sources from the front-end API services
- Seamlessly extend existing architecture with analytics and machine-learning capabilities
Gartner proposes an architecture called the Digital Integration Hub, which is meant to address these challenges:
“API-based fast access to data dispersed across multiple sources is costly and needs notable integration work. Application leaders should implement a digital integration hub to enable high-scale access, minimize workload on systems of record and deliver additional value via use cases like analytics.”
Figure 1: Digital Integration Hub architecture as described by Gartner
To manage data as a valuable, strategic asset, the Digital Integration Hub delivers the ability to orchestrate, unify, govern and share data and analytical insights in a seamless manner.
Informatica and GigaSpaces Partner to Deliver an Intelligent Digital Integration Hub
The partnership between GigaSpaces and Informatica addresses today’s challenges by delivering an Intelligent Digital Integration Hub across cloud, on-premise and hybrid environments.
Figure 2: Intelligent Digital Integration Hub architecture
“What distinguishes the most successful businesses … is that they have developed the ability to manage data as an asset across the whole enterprise.”
“Data should support the many initiatives that are typically part of the digital transformation. For example, digital transformation usually involves the use of next-generation analytics platforms. How do I make analytics available to all the key people within the company, so they can develop predictive insights and so on? If you want to have that kind of widespread, next-generation analytics available, you need a data platform that can support that.”
McKinsey interview of Anil Chakravarthy, CEO of Informatica
The Intelligent Digital Integration Hub
The integration of GigaSpaces’ InsightEdge, an in-memory, real-time analytics and machine learning platform, adds intelligence to Informatica’s Hybrid Integration Platform, including Integration Hub and Cloud iPaaS.
The solution also leverages Informatica’s Enterprise Data Catalog (EDC), powered by the Informatica CLAIRE™ engine, to provide machine-learning-based discovery to scan, catalog and detect data assets across the enterprise.
Consequently, the Intelligent Digital Integration Hub enables applications across the enterprise to leverage timely, trusted data and real-time analytic insights for faster, smarter business decisions.
“We are excited to partner with GigaSpaces to power and accelerate our customers’ digital transformation initiatives. Our integration with GigaSpaces InsightEdge enables our customers to reduce the cost and complexity of data and APIs while achieving high performance at large scale.”
Ronen Schwartz, Senior Vice President and General Manager, Cloud, Big Data, Data Integration, at Informatica
The joint GigaSpaces and Informatica Intelligent Digital Integration Hub is used to:
- Accelerate hybrid data integration and analytics on any data model including structured, unstructured and semi-structured data
- Orchestrate complex transactional and analytical processing with a persistence layer that can be used as a transient layer or as a sync layer
- Facilitate self-service consumption of data and insights
- Decouple source and target applications, allowing easy consumption by users and reducing load on source systems
- Enable multi-latency integration with advanced orchestration and scheduling for batch and API-driven data
- Support extreme performance and rich machine and deep learning capabilities, including Spark, numeric computing via Tensor and loading of pretrained Caffe or Torch models, as well as various NLP, OCR, text classification, image recognition and other libraries
- Integrate multiple clouds, new enterprise applications, in-memory speed layer and data lakes, with existing systems
- Provide visibility, control, monitoring, and alerting across all data workflows
Benefits for Enterprises
The Intelligent Integration Hub powers stronger, innovative enterprises as they continue to add new data-driven applications to remain competitive. By taming complex hybrid integration environments with extreme performance, agility, availability and the ability to deliver real-time actionable insights, the Intelligent Integration Hub provides a range of benefits for enterprises, particularly:
- Superb user experience with low latency response times, especially at critical peaks
- Optimized TCO, allowing the system of record applications to be planned for standard usage and not peaks
- High agility of microservices architecture for rapid development and deployment of new applications
- 24/7 always-on services with market-proven solutions
- Real-time insights to action through analytics and machine learning run on live mutable data for time-sensitive decisions and actions
How Does the Intelligent Digital Integration Hub Work?
The Intelligent Digital Integration Hub uses the publish/subscribe model and allows self-service consumption of data by any user or application. InsightEdge acts as a high-performance speed layer for transactional and analytical processing. It analyzes any data model (text, objects, documents, images, etc.) and enriches the hub with event-driven machine and deep learning capabilities that run on mutable streaming and hot data together with historical data. The resulting insights are instantly shared with the applications and existing systems that have subscribed to the analytic results.
Figure 3: Intelligent Digital Integration Hub – Data Flow Example
1. Loading of the historical data set
2. Training and validation of the machine learning model in InsightEdge
3. Submitting a request for data and streaming via Kafka
4. Triggering an event in InsightEdge and running the classification machine learning model
5. Publishing of the analytics results (classification decision) received from InsightEdge
6. Visualization of the live and historical data with a BI tool and/or via user application
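As a rough illustration of the flow in Figure 3, the following Python sketch models the hub as a tiny in-memory publish/subscribe broker. It is not the Informatica or InsightEdge API. The `Hub` and `Scorer` classes, the `loan_requests` and `loan_decisions` topic names, the record fields, and the threshold rule standing in for the machine learning model are all assumptions made for this example. Only the `german_credit` topic name comes from the use case in this article.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
Handler = Callable[[Message], None]


class Hub:
    """Toy in-memory publish/subscribe hub (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> None:
        # Iterate over a copy so handlers can safely subscribe or publish in turn.
        for handler in list(self._subscribers[topic]):
            handler(message)


class Scorer:
    """Stand-in for the analytics layer; a simple threshold replaces the real model."""

    def __init__(self, hub: Hub) -> None:
        self.hub = hub
        self.history: List[Message] = []
        self.max_repaid_duration = 0

    def on_history(self, record: Message) -> None:
        # Step 1: collect historical records published on the topic.
        self.history.append(record)

    def train(self) -> None:
        # Step 2: "train" by remembering the longest loan duration that was repaid.
        repaid = [r["duration"] for r in self.history if r["repaid"]]
        self.max_repaid_duration = max(repaid, default=0)

    def on_request(self, request: Message) -> None:
        # Steps 3-5: a request arrives, is scored, and the decision is published.
        approved = request["duration"] <= self.max_repaid_duration
        self.hub.publish("loan_decisions", {"id": request["id"], "approved": approved})


hub = Hub()
scorer = Scorer(hub)
decisions: List[Message] = []  # Step 6: a consumer such as a BI tool or mobile app.

hub.subscribe("german_credit", scorer.on_history)
hub.subscribe("loan_requests", scorer.on_request)
hub.subscribe("loan_decisions", decisions.append)

for record in [
    {"duration": 12, "repaid": True},
    {"duration": 24, "repaid": True},
    {"duration": 48, "repaid": False},
]:
    hub.publish("german_credit", record)
scorer.train()

hub.publish("loan_requests", {"id": 1, "duration": 18})
hub.publish("loan_requests", {"id": 2, "duration": 60})
print(decisions)
```

The point of the design is decoupling: the publisher of a loan request never calls the scorer directly, and the consumer of the decision never calls the source system. Each side only knows a topic name.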
The Intelligent Digital Integration Hub in Action
Use Case: Forecasting Credit Risk
Consider a financial institution or bank seeking to minimize risk and maximize profit when providing credit. A decision rule is required to determine whether a person will be able to pay back their loan and who should receive credit approval. This use case presents how the Intelligent Digital Integration Hub can be leveraged to forecast credit risk for loan requests in real-time.
InsightEdge Subscribes to Informatica Integration Hub
Because InsightEdge subscribes to the Digital Integration Hub, historical data published to the hub is delivered to InsightEdge, where it is used to train machine learning models (steps 1 & 2 in Figure 3).
The following example uses a historical data topic called “german_credit.” The data is published to the hub and subscribed to by InsightEdge, by other applications that archive and audit it, and by a mobile app agent:
Figure 4: Intelligent Digital Integration Hub – German Credit Topic Subscribers & Applications
A REST call must be made to retrieve historical data.
If privacy regulations require the anonymization of the data, the integration hub masking capabilities can be used.
Once a loan request is received from the application (step 3 in Figure 3), InsightEdge receives the necessary data and triggers an analytics workflow. Data is read from and written to the hub using REST calls.
The following is an example of the URL endpoint to subscribe to a topic called “german_credit”:
The data in this use case comes from the German Credit Data set and is structured as a table with many attributes, including credibility, the status of the existing checking account, the duration of the credit in months, the history of previous credits, the purpose of the credit requested, savings account/bonds, personal status and gender, and more.
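To make the table structure more concrete, here is a minimal Python sketch of one row as a typed record, parsed from CSV text. The field names, the integer encoding of every attribute, and the CSV header are assumptions chosen for illustration. The actual attribute names and types are the ones defined in the topic structure in the hub.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CreditRecord:
    """One row of the credit table; all names and encodings are hypothetical."""

    credibility: int        # label to predict, assumed 1 = good risk, 0 = bad risk
    account_balance: int    # status of the existing checking account
    duration_months: int    # duration of the credit in months
    credit_history: int     # history of previous credits
    purpose: int            # purpose of the credit requested
    savings: int            # savings account/bonds
    personal_status: int    # personal status and gender


def parse_records(text: str) -> List[CreditRecord]:
    # The header row must use the same names as the dataclass fields.
    reader = csv.DictReader(io.StringIO(text))
    return [CreditRecord(**{key: int(value) for key, value in row.items()}) for row in reader]


sample = (
    "credibility,account_balance,duration_months,credit_history,"
    "purpose,savings,personal_status\n"
    "1,1,18,4,2,1,2\n"
    "0,2,48,2,0,1,3\n"
)
records = parse_records(sample)
print(records[0])
```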
The topic structure screen for the German Credit Data appears as follows in the Intelligent Digital Integration Hub:
Figure 5: Intelligent Digital Integration Hub – German Credit Topic Structure Used for Training
Training and Running the RandomForestClassifier Model
The following is an example of the code for training the Random Forest Classification model from a Zeppelin notebook:
Without cross-validation, the Random Forest Classifier achieved an accuracy of 78.9%; with cross-validation, it achieved an accuracy of 99.5%.
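For readers less familiar with the term, k-fold cross-validation (one common form of cross-validation) splits the data into k parts, trains on k-1 of them, scores the held-out part, and averages the scores. The sketch below shows only that evaluation loop in plain Python. It is not the notebook's Spark ML code, the `train_threshold` learner is a deliberately trivial placeholder rather than a random forest, and the sample rows are made up, so the result says nothing about the accuracy figures quoted above.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, int]          # (single feature, label); hypothetical shape
Model = Callable[[float], int]   # maps a feature value to a predicted label


def train_threshold(rows: Sequence[Row]) -> Model:
    """Placeholder learner: cut halfway between the mean feature of each class."""
    good = [x for x, y in rows if y == 1]
    bad = [x for x, y in rows if y == 0]
    if not good or not bad:
        constant = 1 if good else 0
        return lambda x: constant
    good_mean = sum(good) / len(good)
    bad_mean = sum(bad) / len(bad)
    cut = (good_mean + bad_mean) / 2
    good_is_low = good_mean <= bad_mean
    return lambda x: int((x <= cut) == good_is_low)


def k_fold_accuracy(rows: Sequence[Row], k: int, train: Callable[[Sequence[Row]], Model]) -> float:
    if k < 2 or k > len(rows):
        raise ValueError("k must be between 2 and the number of rows")
    # Assign rows to folds round-robin: fold i gets rows i, i+k, i+2k, ...
    folds: List[List[Row]] = [list(rows[i::k]) for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = train(training)
        correct = sum(model(x) == y for x, y in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Made-up rows: (credit duration in months, 1 = repaid, 0 = not repaid).
rows = [(6, 1), (12, 1), (18, 1), (24, 1), (42, 0), (48, 0), (60, 0), (72, 0)]
accuracy = k_fold_accuracy(rows, k=4, train=train_threshold)
print(f"mean held-out accuracy: {accuracy:.3f}")
```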
Once the model is trained and stored in InsightEdge, it can be run on live data to generate predictions, as shown below:
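The notebook code for this step is not reproduced in this text. As a rough stand-in for the train-then-predict pattern, the sketch below builds a bagged ensemble of one-split trees (decision stumps) in plain Python and lets them vote on a new record. This is a much-simplified relative of a random forest, not Spark ML's RandomForestClassifier: real forests grow deeper trees and also sample features at each split. The two features, the sample rows, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], int]    # (features, label)
Stump = Tuple[int, float, int, int]  # (feature index, threshold, label if <=, label if >)


def majority(labels: Sequence[int], default: int = 0) -> int:
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(rows: Sequence[Row]) -> Stump:
    """Pick the single feature/threshold split with the fewest training errors."""
    overall = majority([y for _, y in rows])
    best: Stump = (0, float("inf"), overall, overall)  # fallback: always predict the majority
    best_errors = sum(y != overall for _, y in rows)
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = majority([y for x, y in rows if x[f] <= t], overall)
            right = majority([y for x, y in rows if x[f] > t], overall)
            errors = sum((left if x[f] <= t else right) != y for x, y in rows)
            if errors < best_errors:
                best, best_errors = (f, t, left, right), errors
    return best


def fit_forest(rows: Sequence[Row], n_trees: int, seed: int) -> List[Stump]:
    rng = random.Random(seed)
    # Each stump is trained on a bootstrap sample (drawn with replacement).
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict(forest: Sequence[Stump], x: Sequence[float]) -> int:
    votes = [left if x[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Made-up history: ([duration in months, age], 1 = repaid, 0 = not repaid).
history: List[Row] = [
    ([6, 30], 1), ([12, 45], 1), ([6, 52], 1), ([12, 28], 1),
    ([48, 33], 0), ([60, 47], 0), ([48, 51], 0), ([60, 29], 0),
]
forest = fit_forest(history, n_trees=15, seed=7)

# "Live" requests arriving after training.
print(predict(forest, [18, 40]), predict(forest, [54, 40]))
```

Keeping the trained ensemble as plain data (a list of tuples) mirrors the idea of training once, storing the model, and then scoring each incoming request cheaply.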
Publishing the Results
The insights gleaned from InsightEdge are published via the Intelligent Digital Integration Hub to the applications that have subscribed to receive them (steps 5 & 6 in Figure 3). In this use case, both the mobile app and Tableau receive the decision for the loan request.
Figure 6: Processed vs. approved credit requests as seen on Tableau
Range of Use Cases
The partnership between market leaders in real-time analytics and data management makes it possible to run real-time analytics and machine learning on any enterprise’s streaming, hot and historical data, and to publish the insights instantly to the subscribed applications. This powers a range of use cases across a variety of industries:
- Finance: Fraud detection, intra-day risk analysis, credit risk forecasts, cash reserve predictions, Customer 360
- Insurance: Usage-based insurance, live risk analysis, Customer 360, customer churn
- Retail/eCommerce: Dynamic pricing, personalized recommendations, intelligent inventory management, Customer 360, location-based promotions
- Transportation: Dynamic pricing, predictive maintenance, fleet management, route planning
- Industrial IoT: Predictive maintenance, supply chain management, inventory planning
- Telco: Intelligent call center routing, cyber and DDoS attack detection, Data Center Infrastructure Monitoring (DCIM), predictive maintenance, network health monitoring
See the Intelligent Digital Integration Hub in Action
The joint solution will be presented at the “AI and Cloud Innovation Zone” during Informatica World 2019, taking place May 20-23 in Las Vegas. Conference attendees can see a live demo, including Apache Spark, Apache Kafka and Tableau on AWS, that shows how data-powered cloud innovation enables credit risk forecasting.