Data Lakes or Digital Integration Hub (DIH): Which is best for OLTP workloads?

Organizations require accurate data that is delivered ultra-fast to serve their modern applications in real time, but when the data is stored in legacy systems, the new digital applications can’t reach it fast enough. Some organizations choose data warehouses, data lakes and semi‑structured databases, such as Snowflake and MongoDB, to hold some or all of their data.

These systems are optimized for running complex analytical workloads – OLAP (Online Analytical Processing) – primarily for business intelligence, decision support, compliance and data mining. They were not, however, built for running low-latency, high-concurrency workloads.

 

For operations where speed is of the essence and the transactions must be correct, fresh and consistent, OLTP (Online Transaction Processing) is used to execute large volumes of transactions concurrently while maintaining data integrity.

CDC to DIH

OLTP workloads demand speed together with full consistency and resilience.

Organizations that use their data lakes, data warehouses or operational databases as the platform for their online digital applications often run into difficulties:

With so many operations occurring concurrently, apps bombard the data lakes and data warehouses directly; bottlenecks form, leading to high latency or even loss of service.

As most organizations have data in both legacy and modern systems, the system can only perform as well as its slowest or weakest link. Downtime in one part of the process can result in the loss of overall service; for large organizations this can mean significant financial consequences.

Data is usually loaded into data lakes in batches, so apps may serve stale data and return inaccurate responses.

Smart DIH – Transactional, distributed and highly performant:

  • Loads data from various sources, including data streams, data lakes, data warehouses and legacy infrastructure, using an event-driven approach
  • Provides secure access to both REST and Web API consumers, as well as to SQL-based systems
  • Maintains full consistency at the transactional level
  • Incorporates the organization’s API management and integration layer
  • Uses established runtime environments and components, including Kubernetes, Kafka, Flink, a proven in-memory data grid, low-code service creation and more
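To make the event-driven loading idea more concrete, here is a minimal Python sketch of how change events from a source system might be applied to an in-memory store. It is an illustration only, not the Smart DIH API: the `ChangeEvent` shape, the `InMemoryHub` class and the per-key `version` field are all assumptions made for this example. The version check is one simple way to keep the copy consistent when events arrive twice or out of order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change captured from a source system (hypothetical shape)."""
    op: str                      # "upsert" or "delete"
    key: str                     # primary key of the record
    version: int                 # assumed to increase per key at the source
    row: Optional[dict] = None   # new record contents for an upsert


class InMemoryHub:
    """Toy stand-in for an in-memory store; not a real product API."""

    def __init__(self) -> None:
        self._rows: dict = {}
        self._versions: dict = {}

    def apply(self, event: ChangeEvent) -> bool:
        """Apply one event; return False if it is stale or a duplicate."""
        if event.op not in ("upsert", "delete"):
            raise ValueError(f"unknown op: {event.op}")
        if event.version <= self._versions.get(event.key, -1):
            return False  # an equal or newer version was already applied
        self._versions[event.key] = event.version
        if event.op == "delete":
            self._rows.pop(event.key, None)
        else:
            self._rows[event.key] = dict(event.row or {})
        return True

    def get(self, key: str) -> Optional[dict]:
        """Serve a read from memory without touching the source system."""
        return self._rows.get(key)


hub = InMemoryHub()
events = [
    ChangeEvent("upsert", "acct-1", 1, {"balance": 100}),
    ChangeEvent("upsert", "acct-1", 2, {"balance": 80}),
    ChangeEvent("upsert", "acct-1", 1, {"balance": 100}),  # late duplicate
]
applied = [hub.apply(e) for e in events]
```

In this sketch the late duplicate is ignored, so a read of `acct-1` keeps returning the newest balance. In a real deployment the events would typically arrive through a stream such as a Kafka topic rather than a Python list.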

Smart DIH provides businesses with a solution that is far more efficient and scalable than data lakes, data warehouses or operational datastores:

Protects mission-critical systems from bottlenecks and excessive workloads, since apps are fully isolated from the systems of record.

Offers high availability since data is always accessible to digital applications, even when the systems of record are down.

Standardizes data pipelines and data microservices, with no-code and low-code options.

Can be deployed in the cloud, on-premises, or in hybrid configurations.
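The isolation and availability points above can be illustrated with a small Python sketch. It assumes a hypothetical `SystemOfRecord` that can go offline and a hypothetical `Hub` that serves reads from its own copy. For brevity the hub refreshes from a full snapshot, which is a simplification of the event-driven loading described earlier, and none of these names come from the Smart DIH product.

```python
class SystemOfRecord:
    """Hypothetical legacy backend that can go offline."""

    def __init__(self, rows: dict) -> None:
        self._rows = dict(rows)
        self.available = True
        self.calls = 0  # number of times the backend was hit

    def snapshot(self) -> dict:
        self.calls += 1
        if not self.available:
            raise ConnectionError("system of record is down")
        return dict(self._rows)


class Hub:
    """Hypothetical hub that answers reads from its own in-memory copy."""

    def __init__(self, source: SystemOfRecord) -> None:
        self._source = source
        self._rows: dict = {}

    def sync(self) -> bool:
        """Refresh the copy; on failure keep serving the last good copy."""
        try:
            self._rows = self._source.snapshot()
            return True
        except ConnectionError:
            return False

    def read(self, key: str):
        """Reads never reach the system of record."""
        return self._rows.get(key)


source = SystemOfRecord({"order-7": "shipped"})
hub = Hub(source)
first_sync = hub.sync()

source.available = False                               # backend goes down
reads = [hub.read("order-7") for _ in range(1000)]     # app traffic continues
second_sync = hub.sync()                               # refresh fails, copy is kept
```

Here the thousand application reads generate no load on the backend, and they keep succeeding during the outage. The trade-off visible in the sketch is that, while the source is down, the hub can only serve data as fresh as its last successful update.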

The Smart DIH approach provides organizations with a high-performing, accurate, distributed in-memory data platform that integrates with existing data management systems.

Learn more about data lakes and digital integration hubs
