It has become almost cliché to note that major companies, across an array of industries, are trying to leverage big data for insights into their businesses.
However, as technology progresses and analytics capabilities increase, a paradigm shift is underway toward a more holistic system of engagement using big data, with a fresh look at analytics for improving business performance, operational efficiency and customer experience.
We’ve noted 5 major trends at the intersection of big data and analytics that we believe are going to continue shaping 2020:
- Hybrid and multi-cloud architecture
- “Translytical” systems, combining transactional operations and analytics
- 360-degree views of data, for continuous business intelligence and machine learning
- Big data and fast data at scale, meeting dynamically changing demands
- Operationalizing Machine Learning with MLOps
Let’s drill down a bit into each of these.
Hybrid and multi-cloud architecture
Most enterprises have recognized the obvious advantages of cloud data services, with their flexibility, scalability, and location-agnosticism. We’re finally seeing the leading banks, telecommunications firms, and other large organizations start making substantial but incremental moves to leverage the cloud for more and more applications.
Responding to this trend, and promoting it, the 3 major public cloud providers – Google, Amazon, and Microsoft – have introduced within the past half-year cloud infrastructure for frictionless hybrid or multi-cloud services. The common idea behind all three offerings (Google Anthos, AWS Outposts, and Azure Arc) is to provide consistency across on-premises and cloud environments – making the transition as smooth as possible.
Anthos marks Google’s official entry into the enterprise data center. It is one of the first official multi-cloud platforms from a mainstream public cloud provider. At the heart of Anthos is the most popular open-source project of our times – Kubernetes, using Google Kubernetes Engine (GKE) as its foundation.
In November 2019, Microsoft also announced a slew of new hybrid cloud products and services. The most significant of these was Azure Arc, a hybrid and multi-cloud platform.
Bringing up the rear, Amazon Web Services, which has long encouraged customers to go all-in on the cloud, finally adapted last December when it announced AWS Outposts. Outposts allows customers to run their applications not only in the AWS cloud, but also in their private legacy data centers, using AWS storage, analytics, and more. Customers can choose between native AWS and VMware Cloud on AWS.
Similarly indicative of the trend is the development of Red Hat OpenShift (and IBM's acquisition of Red Hat), which provides the tools needed for managing hybrid and multi-cloud deployments on both public and private clouds. The popularity of Kubernetes has indeed helped power the notion of seamless hybrid and multi-cloud deployments.
Figure 1: GigaSpaces supports on-premises, cloud, and hybrid environments
A hybrid environment is indeed an excellent way to transition from on-premises to cloud, but it can also be a solution of choice in cases where sensitive data – such as financial, health, or other personal information – is required to remain on-premises, while other information can leverage the benefits of the cloud.
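As a rough illustration of that split, a data placement rule might look like the sketch below. The category names, the `Record` type, and the `choose_location` function are all hypothetical; a real policy would be driven by the organization's compliance requirements rather than a hard-coded set.

```python
from dataclasses import dataclass

# Hypothetical categories treated as sensitive (assumption for illustration only).
SENSITIVE_CATEGORIES = {"financial", "health", "personal"}


@dataclass(frozen=True)
class Record:
    record_id: str
    category: str


def choose_location(record: Record) -> str:
    """Return 'on_premises' for sensitive categories and 'cloud' otherwise."""
    if record.category in SENSITIVE_CATEGORIES:
        return "on_premises"
    return "cloud"


records = [Record("r1", "health"), Record("r2", "clickstream")]
placement = {r.record_id: choose_location(r) for r in records}
```

Here the health record stays on-premises while the clickstream record is free to go to the cloud.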
On the other hand, a multi-cloud and intercloud architecture strategy reduces the reliance on a single vendor, promotes greater cost-efficiencies, and aids in the adherence to regional laws or policies. For large organizations, a multi-cloud environment can also provide geographical distribution of processing requests, reducing latency, and establishing resilience for mitigating the impact of major IT disasters.
Figure 2: GigaSpaces supports multi-cloud deployments with its WAN Gateway module and Red Hat OpenShift Operator certification
Moving to multi-cloud and hybrid environments requires sophisticated orchestration and data management capabilities, which in turn promote a more agile environment and drive a plethora of new services.
Translytical systems
Making real-time decisions on transactional/operational data for time-sensitive services and applications requires the unification of operational and analytical systems. This need has given rise to an emerging modern enterprise architecture that combines OLTP (CRM, ERP, billing, etc.) with OLAP (data lake, data warehouse, BI, etc.) systems.
Figure 3: The transition from the traditional siloed architecture to the unified analytical and transactional processing platform promotes modern data-driven business processes and applications
A hybridized transactional-analytical data processing platform (also referred to as an analytical data platform, augmented transactions, or HTAP, for hybrid transactional/analytical processing) opens the door for novel operational applications, such as:
- Dynamic pricing
- Hyper-personalized recommendations, content, and offers
- Real-time fraud analysis
- Real-time business process optimization
- Predictive maintenance
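To make the idea concrete, here is a minimal sketch of a real-time fraud check in which the write and the analytical query run against the same store. It uses Python's built-in `sqlite3` purely as a stand-in; the `payments` table, the `record_payment` function, and the threshold values are illustrative assumptions, not how any particular translytical platform works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (account TEXT, amount REAL)")


def record_payment(account: str, amount: float, factor: float = 5.0) -> bool:
    """Store a payment and return True if it looks anomalous.

    The analytical step (average of the account's earlier payments) and the
    transactional step (the insert) hit the same store, so there is no ETL
    delay between them.
    """
    with conn:  # commits on success, rolls back on an exception
        avg, count = conn.execute(
            "SELECT AVG(amount), COUNT(*) FROM payments WHERE account = ?",
            (account,),
        ).fetchone()
        conn.execute("INSERT INTO payments VALUES (?, ?)", (account, amount))
    # Only flag when there is some history to compare against (illustrative rule).
    return count >= 3 and amount > factor * avg


flags = [record_payment("acc-1", a) for a in (20.0, 25.0, 30.0, 500.0)]
```

In this toy run only the last payment is flagged, because it is far above the average of the three earlier ones.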
The translytical trend is especially evident with the rise of digital banking. Swedish bank SEB, for example, announced late last year that it is adopting a fully digital cloud-based data platform for core banking. Such functionality offers huge operational cost savings and gives customers the personalized digital services they want.
Not long ago, Gartner summarized the operational effects of translytical processes as real-time analytics, greater situational awareness, and simplified architecture. In our view, the corresponding business impacts are data-driven improvements in customer experience, information monetization, and operational efficiency.
For a comprehensive review of translytical data platform providers, take a look at The Forrester Wave™: Translytical Data Platforms Report 2019.
360-degree views of data
One of the most difficult challenges in large enterprises is getting a unified view of your business, including customers, products, orders, billing and suppliers, unstructured data, transactions, and more. Adding to the complexity is the need to understand the interdependencies of all those elements, as well as to track and integrate real-time and historical data across the lifecycle of a product, across multiple channels, and throughout a customer journey.
To add another wrinkle, such data is typically spread across up to ten different systems, using a variety of data platforms: Oracle DBs, IBM DB2, Cloudera, Amazon S3, Azure Data Lake Storage, and more.
One of the major undertakings in 2020 will be finding solutions that provide a 360-degree view of business data across multiple sources. In November 2019, Salesforce announced its Customer 360 Truth to connect, authenticate, and govern customer data and identity across Salesforce.
Just think about the impact of connecting data for all critical business processes and applications. A 360-degree view of business data is becoming critical to meet the expectations of modern users for highly flexible and interactive queries, to provide comprehensive business intelligence, and to ensure effective machine learning.
Big data and fast data at scale
In traditional industries, such as finance, insurance, and automotive, AI initiatives are usually driven by one or more of these main considerations:
- Operational costs
- Customer experience
- Regulatory adherence (open banking, PSD2 for peer-to-peer payment systems, WLTP in the EU automotive sector, etc.)
The PSD2 directive that came into force in January 2018 meant that Open Banking and the use of open APIs would enable third-party developers to build apps, websites, and services based on the data from banks and financial institutions. Since PSD2, the buildout and implementation of APIs to support Open Banking has increased.
Following in the footsteps of companies like PayPal and Google, banks that began experimenting early on with APIs and collaborating with third parties included big names such as BBVA, Citibank, and Capital One.
Another example of a company built on the basis of the Open Banking regulation is Tink, which raised €90M in 2020 and provides a platform used by PayPal, NatWest, BNP Paribas, ABN Amro, and others. Tink’s open banking solution provides developer APIs that enable banks and other financial service providers to leverage the PSD2 regulation.
The Open Banking regulation requirements become even more pressing when the organization must handle big data and fast data surges and dips – while maintaining service levels and without incurring significant costs.
Peak events can be repetitive and predictable, but they can also be volatile and sudden. In addition, they can include intraday changes that need a response in real time.
A business that goes online for the first time may suddenly need to handle entirely different magnitudes of queries to its system; yet, it cannot completely replace its entire infrastructure to do so.
The challenge is how banks can continue to meet data processing adherence requirements while being able to dynamically scale up and down based on changing load – at an acceptable cost.
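One simple way to reason about this trade-off is a proportional scaling rule with a hard cap on capacity. The sketch below is an assumption-laden illustration: the `desired_replicas` function, the per-replica target of 1,000 requests per second, and the cap of 10 replicas are all made-up values, not figures from any specific platform.

```python
import math


def desired_replicas(observed_load: float,
                     target_load_per_replica: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Size the cluster so each replica carries roughly the target load.

    The result is clamped to [min_replicas, max_replicas]; the upper bound
    acts as a cost ceiling, the lower bound keeps the service available.
    """
    if target_load_per_replica <= 0:
        raise ValueError("target_load_per_replica must be positive")
    wanted = math.ceil(observed_load / target_load_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical intraday load samples, in requests per second.
intraday_load = [800, 4500, 12000, 300]
plan = [desired_replicas(load, 1000) for load in intraday_load]
```

With these numbers the plan scales from 1 replica up to the cap of 10 during the peak and back down to 1 afterwards; the peak sample would need 12 replicas, so the cap is where cost is traded against service level.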
These challenges of scale will continue to drive innovative solutions and cloud-based options going forward.
Operationalizing machine learning with MLOps
One of the most important and transformative trends underway is the move from reactive and interactive analytics to proactive and predictive analytics.
Figure 4: The progression of analytics and business benefits
AI-driven technological advances are driving big data and insight-driven transformation and business intelligence initiatives, including increasing automation of routine actions and more forward-looking analytics. The trend with AI optimization is to move from diagnostic or predictive analytics toward the prescriptive and proactive, in an effort to answer the questions of how the data indicates you can achieve the best outcomes and what can be done proactively to reach that goal.
However, many organizations are struggling to leverage their big data with AI and machine learning-powered analytics solutions. Gartner has noted that only 19% of companies have deployed AI.
From an architecture perspective, they are facing challenges in moving their machine learning models into production, due to the lack of speed, scale, and accuracy necessary to build the data pipeline to feed feature vectors and continuously retrain models for continuing accuracy.
In response to these challenges, businesses are adopting Machine Learning Operations (MLOps) – a practice developed for collaboration and communication between data scientists and the operations or production team. In December 2019, Cloudera called on the industry to create open standards for machine learning operations. MLOps is trending because it increases automation of the relevant processes while improving the quality of ML production and meeting business and regulatory requirements.
Moreover, MLOps collaboration includes the ability to deploy machine learning projects using existing production infrastructures like Spark and Kubernetes, both on-premises and in the cloud.
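One small piece of such automation is deciding when a deployed model should be retrained. The following is a minimal sketch, assuming labeled outcomes arrive after each prediction; the `AccuracyMonitor` class, the window size, and the threshold are hypothetical choices for illustration, not part of any MLOps standard or product.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over a sliding window of recent predictions."""

    def __init__(self, window: int = 100, threshold: float = 0.9):
        self.outcomes = deque(maxlen=window)  # True when a prediction was correct
        self.threshold = threshold

    def record(self, predicted, actual) -> None:
        self.outcomes.append(predicted == actual)

    def needs_retraining(self) -> bool:
        # Wait until the window is full so a few early misses do not trigger retraining.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        accuracy = sum(self.outcomes) / len(self.outcomes)
        return accuracy < self.threshold


monitor = AccuracyMonitor(window=4, threshold=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (1, 1)]:
    monitor.record(predicted, actual)
before_drift = monitor.needs_retraining()

for predicted, actual in [(1, 0), (0, 1)]:  # two recent misses
    monitor.record(predicted, actual)
after_drift = monitor.needs_retraining()
```

In a pipeline, a `True` result would typically kick off a retraining job on the existing production infrastructure rather than retrain inline.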
As the MLOps approach becomes more widely adopted, organizations will be closing the loop between gaining insight and turning it into actionable business value.
Big data and analytics: Driving digital transformation
The tech trends we’ve discussed are being encountered in multiple industries, including financial services, insurance, retail & eCommerce, telecommunications, and transportation. They are clear indicators of a broad digital transformation already underway, driven by big data and cutting-edge analytics – and inspiring, in turn, further innovation and new data-driven business strategies.
To learn how you can simplify the deployment and management of your applications for time-sensitive data-driven decisions, create your GigaSpaces Cloud account and take it for a test drive.
This post was originally published on Toolbox on April 30th, 2020.