During the past few weeks I had the honor of having a discussion with one of our partners, Integrasoft, who developed a distributed Complex Event Processing engine on top of Esper, a popular open-source Complex Event Processing engine, and recently integrated their solution with GigaSpaces.
As many GigaSpaces users already design their applications in an event-driven fashion, I thought that the CEP solution proposed by Integrasoft might be a natural fit.
Below is a summary of a discussion I had with Michael Di Stefano, where he provides an overview of Integrasoft and CEP and later explains their specific solution and how GigaSpaces users can benefit from it:
Michael, can you tell us a bit about yourself and Integrasoft?
We help clients and product companies optimally match business needs with the technology that best meets those needs. This fundamental philosophy is embedded in Integrasoft, the company I founded in 1997. Integrasoft embraced Grid computing when it hit Wall Street almost a decade ago, and quickly promoted the importance of data affinity and the shift of the most important resource in an IT environment from the CPU to the network. Today Clouds and Complex Event Processing (sometimes called CEP, Event Stream Processing, or Business Event Processing) are the leading technologies addressing business needs not just in Financial Services but across industry sectors.
True to form, Integrasoft has taken CEP's "server" approach to event processing and adapted it to a pure distributed computing architecture. CEP Service Clouds abstract CEP engines and provide a framework to network them together, forming a Cloud of CEP that can host "services" in the cloud. These services can take advantage of everything offered by CEP -AND- the virtualization and scale of a cloud. With CEP Service Clouds one can push the processing out to the fringes of the cloud, eliminating unnecessary data movement from a data source to today's centralized CEP servers.
What does CEP stand for?
A definition of Complex Event Processing can be found on Wikipedia:
Complex event processing, or CEP, is primarily an event processing concept that deals with the task of processing multiple events with the goal of identifying the meaningful events within the event cloud. CEP employs techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, relationships between events such as causality, membership, and timing, and event-driven processes. The goal of CEP is to discover information contained in the events happening across all the layers of an organization, analyze its impact at the macro level as a "complex event," and then take subsequent action in real time.
And like all "buzz words," CEP is now commonly applied to many products. Is CEP pure marketing buzz or something else? What is different about CEP from other middleware infrastructures of recent years (Grid, Data Grids, distributed caches, etc.) is that CEP itself is an evolution of what IT professionals have been doing on Wall Street for almost 15 years. Custom code and rules engines are the forerunners of today's CEP products. Today's products are general-purpose tools that allow developers to take advantage of all CEP has to offer without having to know the intricacies happening inside the engine. Is this to say that a rules engine or custom code is not CEP? Not at all; they address event processing well within the scope of the problems they were designed to solve.
Events or transactions occur throughout the enterprise. These events may or may not be dependent on time, may or may not be dependent on sequence, and may come from one or many systems or applications within a business group or across business groups. So in order to consume the events regardless of source, understand each event, and correlate it with other events with regard to time, source, or sequence, one needs to leverage CEP as part of the architecture.
CEP technology has been around for years and is leveraged by many firms. With the technology evolving over the years and a family of CEP engines available from vendors and open source, the choice of which engine(s) to leverage is wide. The key concern as this technology is adopted is consuming these events in a way that is scalable and works out of the box, thus creating distributed CEP Cloud Services.
What is a typical architecture of a CEP solution?
As with most architectures, the 50,000-foot view is similar to that of other event-driven systems: event sources, processing of the events, event consumers, and all the integration necessary to join the systems together (with either tight or loose coupling).
- Event source – This would be the actual messaging provider (GigaSpaces in our specific case)
- CEP engine – This is the heart of the CEP. The matching engine is responsible for processing incoming events against user queries. User queries may require a history of events to fulfill a given query. Users can express their queries in SQL-like semantics. The matching engine triggers the appropriate listener when a query's conditions have been met.
- Event consumers – The specific logic components (e.g. external systems or other middleware components such as GigaSpaces) that are triggered when a certain condition happens.
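To make the components above concrete, here is a minimal sketch of the matching-engine idea in Python: queries are registered as predicates with listeners, and every incoming event is evaluated against them. All names here are illustrative assumptions for the sake of the example, not the Esper or GigaSpaces API.

```python
class MatchingEngine:
    """Toy matching engine: evaluates each incoming event against
    all registered queries and triggers listeners on a match."""

    def __init__(self):
        self._queries = []  # list of (predicate, listener) pairs

    def register(self, predicate, listener):
        # A "query" here is simply a predicate over a single event.
        self._queries.append((predicate, listener))

    def on_event(self, event):
        # The event source (e.g. a messaging provider) would call this
        # for every event it delivers.
        for predicate, listener in self._queries:
            if predicate(event):
                listener(event)


engine = MatchingEngine()
alerts = []

# Query: fire when a trade for ACME exceeds 100.0
engine.register(
    lambda e: e["symbol"] == "ACME" and e["price"] > 100.0,
    alerts.append,
)

engine.on_event({"symbol": "ACME", "price": 99.5})   # no match
engine.on_event({"symbol": "ACME", "price": 101.2})  # listener fires
```

A real engine differs mainly in that queries can span many events over time (windows, joins, correlation), which is where the SQL-like query semantics mentioned above come in.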
However, as you dive deeper into the CEP architecture, significant differences emerge. Joining event streams together in a fashion logically similar to joining tables in a database greatly simplifies implementations while reducing latency and increasing throughput. The result is a smaller and simpler code base and therefore faster delivery of function. However, if one is not careful, these benefits can quickly erode by trying to do too much inside the CEP engine, where the complexity of the streaming rules affects performance and maintainability.
What are the differences between a complex event and a regular event?
Quoting CEP provider Esper:
“Regular events normally represents a concrete state, a complex event is normally an aggregation of multiple events (not necessarily of the same type) that identify a meaningful event.”
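As a concrete illustration of that distinction, the hedged sketch below aggregates regular events (individual failed logins) into a complex event (suspicious activity) when several of them correlate within a time window. The event shapes, names, and thresholds are assumptions made up for this example.

```python
from collections import deque

def detect_suspicious(events, window=60, threshold=3):
    """Derive complex events from a stream of regular events:
    `threshold` FAILED_LOGIN events for the same user within
    `window` seconds aggregate into one SUSPICIOUS_ACTIVITY event.
    Events are (timestamp, user, kind) tuples, ordered by timestamp."""
    recent = deque()        # sliding window of recent failed logins
    complex_events = []
    for ts, user, kind in events:
        if kind != "FAILED_LOGIN":
            continue
        recent.append((ts, user))
        # Evict events that fell out of the time window.
        while recent and ts - recent[0][0] > window:
            recent.popleft()
        if sum(1 for _, u in recent if u == user) >= threshold:
            complex_events.append(("SUSPICIOUS_ACTIVITY", user, ts))
    return complex_events


stream = [
    (0, "bob", "FAILED_LOGIN"),
    (10, "bob", "FAILED_LOGIN"),
    (15, "alice", "LOGIN"),
    (20, "bob", "FAILED_LOGIN"),
]
print(detect_suspicious(stream))  # -> [('SUSPICIOUS_ACTIVITY', 'bob', 20)]
```

No single event in the stream is meaningful on its own; the complex event only exists as an aggregation across several of them, which is exactly the distinction quoted above.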
What is the difference between standard queries and continuous queries?
A standard query, associated with a database, operates over the tables in the database and returns a result data set. The tables stand still and the query traverses the data set (the tables in the database). It runs once and exits.
Continuous queries are always "on". The query itself, however, is stationary, and the data (events) stream through the query; when the conditions of the query are met, a resulting event is generated into the event engine for further rule processing and/or output to subscribers of that event. Visualize the query as a two-dimensional plane with the various data streams passing orthogonally through the query plane for evaluation. The resulting events emerging on the other side of the query plane are new events produced when the query conditions evaluate to true.
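The contrast can be sketched in a few lines of Python. The names below are illustrative, not any product's API: the standard query runs once over data at rest and exits, while the continuous query is a stationary predicate that stays "on" and emits a new event each time streaming data satisfies it.

```python
# Data at rest (a "table") and, equivalently, a stream of the same events.
table = [{"sym": "ACME", "px": 99}, {"sym": "ACME", "px": 103}]

def standard_query(rows):
    """Standard query: one pass over stationary data,
    returns a result set, then exits."""
    return [r for r in rows if r["px"] > 100]

def continuous_query(stream):
    """Continuous query: the predicate stands still while events
    stream through it; each match is emitted immediately as a
    new event for downstream processing."""
    for event in stream:          # stays "on" as long as events arrive
        if event["px"] > 100:
            yield {"type": "PriceAlert", **event}

result_set = standard_query(table)               # runs once, done
alerts = list(continuous_query(iter(table)))     # emitted as events flow
```

In a real CEP engine the continuous query would never terminate; the generator here ends only because the example stream is finite.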
What is the difference between CEP and MOM such as JMS?
MOM and JMS are transports that move data across a network. CEP is a process that may use these middleware tools as a mechanism to consume and publish event streams.
What does Integrasoft add to current CEP engines?
CEP engines have been leveraged in many business applications as siloed solutions. With the great push into Cloud infrastructure for High Performance Computing, many firms are finding the need to run services across multiple CEP engines, regardless of the nature of the Cloud.
The Integrasoft CEP Cloud Services product offers the infrastructure required to run CEP Cloud(s) that meet business application demands, with the required services delivered via CEP engines "networked" together and running transparently as virtualized services. The solution offers holistic and intelligent CEP Services within the Cloud.
The CEP Cloud Services form "Clouds of CEP" to be leveraged as needed by applications within the heterogeneous environment typically found in any infrastructure. Integrasoft CEP Cloud Services allows complex events to be processed scalably with the cloud and transparently to the applications.
As illustrated in the diagram below, the solution offers:
Transparency – The applications need not know about the underlying architecture.
Scalability – “Sub-Clouds of CEP Services” within the cloud if needed.
Holistic & Intelligent Services – Intelligent processing of all events against defined rules. Holistic view of what is going on in the cloud.
Inter-CEP Engine communication – A framework for CEP engines to establish the communication needed for total transparency, scalability, and manageability.
What is the value for GigaSpaces users?
Leveraging CEP engines in a public or private cloud for services brings another degree of challenges. Business application development and deployment teams are concerned about:
· Underlying middleware technology
· Underlying Database technology
· Underlying business event processing technology
These challenges impact development time and cost as well as time to market. With the integration of Integrasoft CEP Cloud Services and GigaSpaces XAP, the solution offers a complete services infrastructure to handle all business events with:
· Integrated solution for event processing
· Virtualized CEP Cloud(s) Services
· Distributed cache for HPC
· Underlying distribution messaging
· Underlying database persistence
· Easier deployment and management
The solution offers high-performance CEP + Messaging + Distributed Cache, which is typically needed by HPC applications. The tight binding of these three functionalities enables a new class of applications characterized by high event rates, data intensity, and complex relations and correlations of events.
Now, business SLA demands can be met with the linear performance scale of a Cloud and constant latency as event rates increase, where event sources and event processing are physically close together to maximize efficiency and performance. In summary, the solution is geared to control costs and minimize the operational risks inherent in Cloud infrastructure.
Event Driven Architecture (EDA) has become more popular in recent years due to the demand for greater scalability. CEP provides a way to extend the use of EDA into the way we process and access our data. Having said that, most existing CEP solutions rely on a centralized event coordinator, which can become a scalability bottleneck. What's interesting about the approach taken by Integrasoft is that it brings the scalability of GigaSpaces to CEP through the integration of Esper and GigaSpaces.
I would be very interested to learn about your specific requirements in this area and to work collaboratively with Integrasoft to make sure the integrated solution can address your needs. Feel free to contact me or Michael directly in that regard, or simply post a comment on this post.