MQTT is an OASIS standard messaging protocol for the Internet of Things (IoT). But what exactly can you do with it? It is designed as an extremely lightweight publish/subscribe messaging transport that is ideal for connecting remote devices with a small code footprint and minimal network bandwidth. That is how the protocol is introduced on https://mqtt.org/. The question is whether MQTT solves all your messaging challenges.
In some industries, IoT devices are very common and certainly nothing new, but the way we interact with data streams from sensors has evolved over the years.
There are many traditional ways to communicate with IoT gateways, and almost every supplier ships its own software with its own custom steps for interacting with them. The raw sensor data is transmitted and stored, and can be retrieved at any time. Although the data can be of different types, such as equipment, environmental or submeter data, the main characteristics are most likely the same:
- Very large volumes of small data packages
- Most data will be repetitive time series
So why is this relevant? Because raw data by itself is useless. To turn sensor data into meaningful and useful information, it has to be retrieved and processed. With large volumes of repetitive data, that processing can be quite challenging. MQTT addresses the problem of transporting the data, not the problem of processing or using the data in a meaningful manner. MQTT is so efficient because it decouples the producers of data from its consumers and does not rely on synchronous poll/response cycles. If you’re interested in some history from the co-inventor of MQTT himself, this article is recommended.
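To illustrate what decoupling without synchronous polling means, here is a minimal sketch in Python. The `TinyBroker` class and the topic names are hypothetical and exist only for this illustration; it is an in-memory stand-in, not an MQTT implementation. The point is that the publisher only knows a topic, never the receivers, and subscribers are called when data arrives instead of asking for it repeatedly.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class TinyBroker:
    """Hypothetical in-memory stand-in for a broker (not real MQTT)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A subscriber registers interest once; it never polls for data.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        # The publisher does not know who (if anyone) is listening.
        for handler in self._subscribers.get(topic, []):
            handler(topic, payload)


received = []
broker = TinyBroker()
broker.subscribe(
    "sensors/temperature",
    lambda topic, payload: received.append((topic, payload)),
)
broker.publish("sensors/temperature", "21.5")
broker.publish("sensors/humidity", "40")  # no subscriber, so nothing happens
```

After running this, `received` holds only the temperature message. A real broker adds networking, sessions and delivery guarantees on top of this basic idea.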
Processing style
To turn data into useful information, processing is required, and the way data is transported should make that later processing easier. Processing can be done in a number of ways, a traditional one being batch processing. Although it is widely used, batch processing can be cumbersome in many cases because:
- 90% of the data has not changed or is not significant enough. For example, a temperature sensor that always registers the same temperature. You only want to do something when the temperature drops or rises significantly.
- 90% of the time it will be too late, because there is lag between data collection and processing. For example, if a railroad switch is signaling its own malfunction, you don’t want to wait till the next day.
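The first point, reacting only to significant changes, can be sketched as a simple threshold filter. This is a minimal illustration in Python; the function name `significant_changes`, the threshold of 1.0 and the sample readings are assumptions made up for this example, not values from a real sensor.

```python
from typing import Iterable, Iterator, Optional


def significant_changes(readings: Iterable[float], threshold: float) -> Iterator[float]:
    """Yield a reading only when it differs from the last reported
    reading by at least `threshold` (the first reading is always reported)."""
    last_reported: Optional[float] = None
    for value in readings:
        if last_reported is None or abs(value - last_reported) >= threshold:
            last_reported = value
            yield value


# Hypothetical temperature series: mostly repetitive, with one clear drop.
temperatures = [20.0, 20.1, 20.0, 19.9, 17.5, 17.6, 20.2]
reported = list(significant_changes(temperatures, threshold=1.0))
```

Here `reported` contains three of the seven readings: the first one, the drop and the recovery. Because the function works on any iterable, the same filter could be applied to a stream as readings arrive rather than to a stored batch.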
This doesn’t mean you cannot process your IoT data stream in batches, because there are still cases where the amount of data is manageable or you simply don’t have that much data. In those cases, you can store the data in a database and perform analytics on it. Using an MQTT client to access your data will make that easy.
If you do have large volumes of data and you don't want to react too late, you will most likely need real-time processing. Everyone with some experience in software architecture knows that real-time means hard work and requires a capable and robust infrastructure, including the right tools to react to ongoing events. Maybe even more important is knowing which events not to react to, because they are irrelevant for your decision making. Storing all data in old-fashioned databases and querying them repeatedly will not work in this fast-paced era. Event streaming is a popular answer and addresses many challenges in real-time processing, but you still need something to connect your raw data streams to this engine before you can use it. MQTT provides the solution for this challenge.
Connect your data streams to your processing engine with MQTT
Connecting your IoT data streams to your backbone: that is exactly what MQTT can do. MQTT and Kafka are often mistaken for alternatives to each other, but they differ in purpose and can complement each other. The confusion around MQTT and Kafka is understandable, because both implement an asynchronous publish/subscribe pattern.
While Kafka aims at distributed, persistent storage of data on topics, MQTT focuses on reliable communication between client and broker. Another difference: MQTT is a protocol, whereas Kafka is a solution that requires implementing the Kafka client in your software stack. Basically, once data has been transferred from IoT and edge devices to the broker, the main job of the MQTT client and broker is done.
For your Kafka solution, the main function starts here: preserving the data on Kafka topics with the configured retention, so your processing engine can do the heavy lifting. But before any processing can start, irrelevant data should be filtered out. As mentioned earlier, 90% of the data has not changed or is not significant, so it can probably be dropped as irrelevant for processing.
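The hand-over from MQTT to Kafka, including the filtering step, could look roughly like the sketch below. It deliberately uses no MQTT or Kafka library: `bridge`, `to_kafka_topic`, the `produce` callback and the topic names are all hypothetical, and in a real setup `produce` would wrap whatever Kafka client you use. The one fixed fact it relies on is that Kafka topic names may not contain "/", so the MQTT level separator is mapped to ".".

```python
from typing import Callable, Iterable, List, Tuple

Message = Tuple[str, str]  # (MQTT topic, payload)


def to_kafka_topic(mqtt_topic: str) -> str:
    # Kafka topic names cannot contain "/", so replace the MQTT separator.
    return mqtt_topic.replace("/", ".")


def bridge(
    messages: Iterable[Message],
    is_relevant: Callable[[str, str], bool],
    produce: Callable[[str, str], None],
) -> int:
    """Forward only relevant messages and return how many were forwarded."""
    forwarded = 0
    for topic, payload in messages:
        if is_relevant(topic, payload):
            produce(to_kafka_topic(topic), payload)
            forwarded += 1
    return forwarded


# Stand-in for a Kafka producer: collects what would have been sent.
sent: List[Message] = []

# Hypothetical messages from a railroad switch that reports its own status.
incoming = [
    ("rail/switch7/status", "OK"),
    ("rail/switch7/status", "OK"),
    ("rail/switch7/status", "MALFUNCTION"),
]

count = bridge(
    incoming,
    is_relevant=lambda topic, payload: payload != "OK",
    produce=lambda topic, payload: sent.append((topic, payload)),
)
```

In this run only the malfunction message is forwarded, under the topic `rail.switch7.status`. Passing the relevance check in as a function keeps the bridge itself unaware of what "irrelevant" means for a particular sensor.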
Which version to use?
eMagiz added support for an MQTT broker a few weeks ago. Connecting to MQTT clients was already technically possible, but now it is also possible to equip your message broker with an MQTT gateway.
It acts as a bridge for your core and can be used as a gateway linking your IoT data with your event streaming solution. All clients that support version 3.1.1 can publish their data to this MQTT broker. This is not the latest version; that is version 5.0. Although version 5.0 brings some improvements over version 3.1.1, it is not widely used and is not (yet) accepted as a de facto standard. Version 5.0 makes the protocol more flexible and takes up even less bandwidth. However, many vendors, and therefore much software, still only support version 3.1.1.
When implementing MQTT, using and choosing the most suitable QoS configuration is important and can be challenging.
Next time I will elaborate more on this and related subjects. Please let me know your opinion or experiences via a comment or via LinkedIn. Thanks for reading!
By Samet Kaya, Software Delivery Manager @ eMagiz