GigaSpaces provides advanced data integration tools with enterprise-grade capabilities that offer simpler integration and faster time-to-value, and scale on demand to support increasing workloads. The platform integrates data into the host module, where it is consumed by the data services that are created and maintained on the platform.
GigaSpaces provides out-of-the-box capabilities for:
- Event-based data ingestion from source data stores, with automatic creation and management of data pipelines
- Data cleansing and validation policies, with built-in rules that determine whether incoming data is rejected or cleansed
- Built-in reconciliation mechanisms to support various recovery and schema change scenarios
- Monitoring, control and error handling
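The cleanse-or-reject idea behind the validation policies can be illustrated with a minimal sketch. The rule names and record shape below are hypothetical, not the GigaSpaces API:

```python
# Hypothetical sketch of a cleanse-or-reject validation policy.
# Rule names and the record shape are illustrative, not GigaSpaces APIs.

def cleanse_whitespace(record):
    # Cleansing rule: normalize string fields before validation.
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}

def validate(record, required=("id", "amount")):
    # Validation rule: reject records missing required fields.
    return all(record.get(field) is not None for field in required)

def apply_policy(records):
    # Cleanse each record, then route it to the accepted or rejected set.
    accepted, rejected = [], []
    for record in records:
        record = cleanse_whitespace(record)
        (accepted if validate(record) else rejected).append(record)
    return accepted, rejected

accepted, rejected = apply_policy([
    {"id": 1, "amount": 9.5, "name": "  Alice "},
    {"id": 2, "amount": None},
])
```

In a real pipeline such rules would be declared as policy configuration rather than code, but the flow is the same: cleanse first, then accept or reject.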
The data integration tools reduce development overhead by automatically scanning the source schema and metadata and mapping them to the GigaSpaces data model. Data sources may be relational databases, NoSQL databases, object stores, file systems, or message brokers. Data may be structured or semi-structured, and may be integrated as a stream or in batches.
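The schema-scanning step can be sketched as follows, with SQLite standing in for the source database; the target type names in the mapping table are assumptions for illustration, not the actual GigaSpaces type system:

```python
import sqlite3

# Illustrative mapping from source column types to a target data model;
# the target type names are assumptions, not GigaSpaces types.
TYPE_MAP = {"INTEGER": "long", "TEXT": "string", "REAL": "double"}

def scan_schema(conn, table):
    # Read column names and declared types from the source's metadata.
    cursor = conn.execute(f"PRAGMA table_info({table})")
    return {row[1]: row[2].upper() for row in cursor}  # name -> declared type

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, balance REAL)")

# Map each source column to the target model, defaulting unknown types.
model = {col: TYPE_MAP.get(decl, "object")
         for col, decl in scan_schema(conn, "users").items()}
```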
GigaSpaces' data integration is built on a pluggable connector framework that provides seamless integration with third-party and proprietary connectors, and allows continuous enhancement of GigaSpaces' built-in integration portfolio.
GigaSpaces Accepts Input From Any Data Source, Including:
- Change Data Capture (CDC): primarily for core data that is frequently updated, such as user transactions
- Full collections/table updates: built-in change management support for data pipeline definition, including adding a new table without stopping the stream
- Streams: append data in real time via a message queue or a bus such as Kafka
- Batch updates: data is extracted from the source using an ETL process; online updates are executed in incremental batches
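The incremental-batch pattern above can be sketched with a watermark: each batch extracts only rows newer than the last one seen. SQLite stands in for the source here, and the table and column names are illustrative:

```python
import sqlite3

def extract_batch(conn, watermark, limit=100):
    # Incremental batch: pull only rows added since the last watermark.
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (watermark, limit),
    ).fetchall()
    # Advance the watermark to the newest row extracted in this batch.
    new_watermark = rows[-1][0] if rows else watermark
    return rows, new_watermark

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events (payload) VALUES (?)",
                 [("e%d" % i,) for i in range(5)])

batch1, wm = extract_batch(conn, watermark=0, limit=3)   # rows 1..3
batch2, wm = extract_batch(conn, watermark=wm, limit=3)  # rows 4..5
```

Persisting the watermark between runs is what makes the batches incremental rather than full extracts.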
Advanced Data Integration Tools:
- Native Kubernetes Support: support for Docker-based processing unit deployments via Kubernetes microservices design patterns
- Incremental batch: recommended for integrating slowly changing data
- Full integration with OpenShift 4.4+: supported topologies for on-prem, cloud, hybrid and multi-cloud deployments
- Check Data Freshness: Smart DIH defines a threshold for data freshness per source table. This is the foundation for services' awareness of data being stale, allowing the application to provide an informed user experience
- Redesigned SQL Engine: the Apache Calcite SQL engine with PostgreSQL wire protocol offers low-latency distributed query execution and query optimization
- Data gateway: the client-less PostgreSQL wire protocol provides seamless integration and scales on demand to support increasing workloads, integrating with:
  - Business intelligence (BI) tools such as Power BI and Tableau
  - Developer tools such as DBeaver
  - Data integration tools such as Talend
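The per-table freshness threshold described under "Check Data Freshness" can be sketched as a simple staleness check. The threshold values and table names below are illustrative, not actual Smart DIH configuration:

```python
import time

# Hypothetical per-table freshness thresholds, in seconds.
FRESHNESS_THRESHOLDS = {"user_transactions": 5, "reference_rates": 3600}

def is_stale(table, last_updated, now=None):
    # A table is stale when its last update is older than its threshold,
    # letting a service surface freshness to the application.
    now = now if now is not None else time.time()
    return (now - last_updated) > FRESHNESS_THRESHOLDS[table]
```

A service can consult such a check before serving a query and, for example, flag results from a stale table instead of presenting them as current.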