Why stream processing is a component of true value.

Stream processing means processing data while it is still flowing between different systems. Why would you want to adopt it in your landscape? Read all about it in this blog, where Leo describes the differences between message queues and stream processing.

In this blog

In this blog we want to shine a light on a valuable component within the integration pattern event streaming, namely stream processing. Stream processing means processing data while it is still flowing between different systems. We explain what the concept entails and why you would want to adopt it in your landscape. We elaborate on the type of architecture that fits well with stream processing and argue why it might be the tool for IoT environments. Want to know more about the integration pattern event streaming first? Read all about it via this link.

Stream processing is the act of processing a stream of data. As a method it is closely related to IoT, but it is also helpful in big data projects, particularly because of its ability to process large amounts of data quickly and continuously. It is used to immediately discover and process events (changes in circumstances) while the data flows from A to B.

In this short period of time, valuable new information can be derived immediately from your data streams through processing. For example, a notification can be sent when a certain storage limit of your warehouse is exceeded. The data streams come from a number of sensors and systems and are processed together, so that a warning is created as soon as the calculated limit value is reached. Compared to traditional processing, the event stream, including the data, is now active and can be continuously queried for new insights.
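The warehouse example above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the event shape (sensor id plus units currently stored), the function name storage_alerts and the limit of 100 are all assumptions made for the example. The point is that each reading is evaluated the moment it arrives, rather than after being stored and queried later.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (sensor_id, units currently stored in that zone)
Reading = Tuple[str, int]


def storage_alerts(readings: Iterable[Reading], limit: int) -> Iterator[str]:
    """Yield a warning each time the combined stock level rises above `limit`."""
    latest: Dict[str, int] = {}  # most recent reading per sensor
    above = False                # are we currently over the limit?
    for sensor_id, units in readings:
        latest[sensor_id] = units
        total = sum(latest.values())
        # Only warn on the transition from "below" to "above" the limit.
        if total > limit and not above:
            yield f"storage limit exceeded: {total} > {limit}"
        above = total > limit


events = [("zone-a", 40), ("zone-b", 50), ("zone-a", 60),
          ("zone-b", 30), ("zone-a", 75)]
alerts = list(storage_alerts(events, limit=100))
for alert in alerts:
    print(alert)
```

Because storage_alerts is a generator, it works just as well on an endless source of readings as on the short list used here.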

Figure 2. Difference between traditional database and event stream processing.

As already indicated, the value of data is nowadays determined by insights derived from processing all kinds of data. Deriving information from data at the right time can be crucial for decision making. Lag between data generation and analysis can often reduce the value of information.

By deploying stream processing within your organization you improve the circumstances to support ‘real-time decision making’ scenarios. Stream processing enables delivering insights faster, often within milliseconds after a certain event has triggered the system.

There are countless methods for data processing. Regardless of your use case, existing integration patterns can be used to get to the desired result. However, there are clear use cases where one might opt for a stream processing integration pattern instead of a batch processing pattern. One example is processing a continuous stream with an endless number of events. In such a flow, data patterns must be recognized in real time, and the results must be grouped, analyzed and processed immediately. This must be done for multiple data streams simultaneously.

Event driven architecture

Stream processing can also be described as a type of event-driven architecture, one that is increasingly used to meet the growing demand generated by an ever-expanding data-driven society. What is an ‘event’ driven architecture?

In an event driven architecture you have a component that performs a certain action that’s important to other components. One component (the producer) produces an event, a record of the event is stored, and another component (the consumer) consumes this event so that it can perform its own tasks as a result of (or influenced by) this event.
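As a rough sketch of this producer, record and consumer relationship, consider the toy in-memory broker below. The class EventBus, the event name order_placed and the two consumers are hypothetical and chosen only for illustration. A real broker would deliver events asynchronously and over a network; here delivery is a plain function call to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    name: str
    payload: dict


class EventBus:
    """Toy broker: stores a record of each event and hands it to subscribers."""

    def __init__(self) -> None:
        self.records: List[Event] = []
        self.handlers: Dict[str, List[Callable[[Event], None]]] = {}

    def subscribe(self, name: str, handler: Callable[[Event], None]) -> None:
        self.handlers.setdefault(name, []).append(handler)

    def publish(self, event: Event) -> None:
        self.records.append(event)  # the record of the event is stored
        for handler in self.handlers.get(event.name, []):
            handler(event)          # each consumer performs its own task


bus = EventBus()
invoices: List[int] = []
shipments: List[int] = []

# Two independent consumers react to the same event.
bus.subscribe("order_placed", lambda e: invoices.append(e.payload["order_id"]))
bus.subscribe("order_placed", lambda e: shipments.append(e.payload["order_id"]))

# The producer only knows the bus, not who is listening.
bus.publish(Event("order_placed", {"order_id": 17}))
```

The producer never refers to the invoicing or shipping consumers directly, so a third consumer can be added with one more subscribe call and no change to the producer.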

Separating consumers and producers gives an event-driven architecture the following benefits:

  • Asynchronous traffic
  • Individual components
  • Easily scalable
  • No additional development for one-to-many integrations

The difference between stream processing and the message queue

There are two variants of event-driven architecture, namely message queues and stream processing. Let's take a brief look at the differences between the two.

Within ‘traditional’ event driven architectures, the producer places a message in a queue that is aimed at a specific consumer. That message is kept in the queue (mostly in a first-in, first-out sequence) until the consumer collects it, after which the message is deleted.

With stream processing, messages are not directed to a particular recipient, but are published on a specific topic and available to all consumers. All recipients that require access to the topic can subscribe and read the message. Because the message must be available to all consumers, it is not deleted when it’s read from the stream.
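The difference in delivery semantics can be made concrete with two small in-memory classes. Both MessageQueue and Topic are simplified stand-ins written for this post and do not mirror the API of any particular product; the per-consumer offset is an assumption about how a stream might track which messages each reader has already seen.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class MessageQueue:
    """Queue semantics: a message is removed once a consumer collects it."""

    def __init__(self) -> None:
        self._items: Deque[str] = deque()

    def send(self, message: str) -> None:
        self._items.append(message)

    def receive(self) -> Optional[str]:
        # First in, first out; returns None when the queue is empty.
        return self._items.popleft() if self._items else None


class Topic:
    """Stream semantics: messages stay in the log, each consumer keeps an offset."""

    def __init__(self) -> None:
        self._log: List[str] = []
        self._offsets: Dict[str, int] = {}

    def publish(self, message: str) -> None:
        self._log.append(message)

    def poll(self, consumer: str) -> List[str]:
        start = self._offsets.get(consumer, 0)
        self._offsets[consumer] = len(self._log)
        return self._log[start:]  # reading does not delete anything


queue = MessageQueue()
queue.send("order-1")
first_read = queue.receive()    # the message
second_read = queue.receive()   # nothing left: it was deleted on read

topic = Topic()
topic.publish("order-1")
billing_view = topic.poll("billing")
shipping_view = topic.poll("shipping")  # still sees the same message
```

With the queue, only one reader ever gets "order-1". With the topic, every subscriber gets it, and each one simply advances its own position in the log.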

Stream processing, the tool for IoT

Stream processing is well suited to processing and analyzing high volumes of event-driven data messages effectively. This applies especially to IoT use cases, because they typically produce time series data. Time series data can be described as a collection of observations obtained by continuously performing measurements. If we were to plot this data in a graph, one axis would always contain time. This type of data is often the result of using sensors in a variety of operations such as traffic, industry and healthcare. It can also come from log data, for example transaction logs, activity logs and all kinds of other logs.
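One common way to group time series data while it streams is to cut it into fixed, non-overlapping time windows and summarize each window as soon as it closes. The sketch below is only an illustration of that idea: the function name tumbling_average, the (timestamp, value) event shape and the window size of 10 time units are assumptions, and readings are assumed to arrive in timestamp order.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical event shape: (timestamp, measured value)
Measurement = Tuple[int, float]


def tumbling_average(readings: Iterable[Measurement],
                     window: int) -> Iterator[Tuple[int, float]]:
    """Yield (window_start, mean value) each time a window closes."""
    current_start: Optional[int] = None
    total, count = 0.0, 0
    for timestamp, value in readings:
        start = timestamp - timestamp % window  # window this reading belongs to
        if current_start is not None and start != current_start:
            # A reading from a later window arrived: emit the finished one.
            yield (current_start, total / count)
            total, count = 0.0, 0
        current_start = start
        total += value
        count += 1
    if count and current_start is not None:
        # Finite input only: flush the last, still-open window.
        yield (current_start, total / count)


readings = [(0, 20.0), (4, 22.0), (11, 30.0), (15, 34.0), (21, 18.0)]
averages = list(tumbling_average(readings, window=10))
print(averages)
```

Only the running total and count of the current window are kept in memory, so the raw measurements never have to be stored before they are analyzed.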

As promising as stream processing is, organizations must be aware that stream processing as a pattern or architecture is not a panacea. There are plenty of situations and use cases in which other integration patterns will be more effective and/or efficient. Stream processing is not the right architecture when, for example, the entire data set must be processed multiple times or when the processing relies on random access. Furthermore, analysis at the ‘edge’ of the infrastructure, for example edge machine learning, is an architecture that doesn’t fit well with stream processing.

Figure 3. Retail example of two data streams, one of sales and one of incoming goods. These streams are continuously processed which produces a new real time flow with data on stock.
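The retail example in Figure 3 can be approximated with the following sketch. It assumes a hypothetical event shape of (timestamp, product, quantity), assumes both input streams are already ordered by timestamp, and uses the made-up function name stock_stream. The standard library function heapq.merge is used to interleave the two streams lazily in time order.

```python
import heapq
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp, product, quantity)
StockEvent = Tuple[int, str, int]


def stock_stream(sales: Iterable[StockEvent],
                 incoming: Iterable[StockEvent]) -> Iterator[StockEvent]:
    """Combine sales and incoming goods into a running stock level per product."""
    stock: Dict[str, int] = {}
    # Sales reduce stock, so turn them into negative quantities.
    signed_sales = ((t, product, -qty) for t, product, qty in sales)
    # heapq.merge interleaves two already-sorted streams without loading them fully.
    for t, product, delta in heapq.merge(signed_sales, incoming):
        stock[product] = stock.get(product, 0) + delta
        yield (t, product, stock[product])


incoming_goods = [(1, "apples", 10), (4, "apples", 5)]
sales = [(2, "apples", 3), (3, "apples", 4)]

stock_levels = list(stock_stream(sales, incoming_goods))
for event in stock_levels:
    print(event)
```

Each output event is itself part of a new stream, which other consumers could subscribe to in the same way as the two input streams.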

Stream processing has become a preferred choice for many event-driven systems and patterns. It offers several advantages:

  • Real-time decision making
  • Enrichment of ‘traditional’ BI with predictive BI
  • The ability to process and analyze high volumes of ‘raw’ data without the need to store it first.

Generally speaking, it increases the flexibility of your data integration landscape enormously. With that, it’s an ingredient of true value for that landscape.

By Leo Bekhuis, Software Engineer @ eMagiz