NASA, where it all started
The concept of the digital twin is simple: a digital counterpart of something in the physical world, such as a physical object or process, plus the flow of information that keeps the former up to date with the latter. While the term digital twin was first used in 2002 by John Vickers of NASA, the concept itself is much older and has been applied by NASA since the 1970s.
It is important to note the difference between a static model or simulation and a digital twin. NASA learned this during the Apollo 13 mission: after launch, engineers could no longer rely on their ground models, which were no longer representative because the actual spacecraft had changed in the hostile environment of space. Instead, they used live data from the spacecraft to maintain a continuously up-to-date model, which they could use for real-time decision making when incidents occurred.
The same applies not only to spacecraft, but also to other physical assets and processes that need to be monitored continuously. Consider, for instance, the supply chain of an industrial bakery: the supply chain is first modelled by describing the different suppliers and the production process, and then gradually implemented. However, during and after implementation the supply chain changes, and the information it produces becomes scattered as the environment changes and the supply chain adapts. For instance, one of the bakery’s suppliers may change their delivery rates, or the efficiency of the production process may change. This introduces a problem: if the supply chain changes and we don’t know how, we no longer know how it performs, where the bottlenecks are, or how to optimize it.
The solution is a digital twin. Because the data produced by the supply chain is processed into a live state, decisions can always be based on the current conditions rather than on what happened in the past. In our bakery example, we can use the actual delivery rates to determine whether we need to change suppliers in order to maintain a certain output. These decisions can be made manually, but they can also be automated by exposing the digital twin to other applications, such as one that selects a supplier based on the current delivery rates.
Because the digital twin is constantly updated, we can even act proactively, taking action automatically based on changes to the twin, for instance by immediately ordering supplies from another supplier when we find that the input does not match our expectations. In essence, we analyze the state changes in the digital twin and generate exceptions from them, so that we can respond immediately to significant events.
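To make this concrete, here is a minimal Python sketch of that pattern. It assumes the twin state is a simple dictionary of per-supplier delivery rates; the expected rate, the tolerance, and the reorder_from_backup handler are illustrative assumptions, not part of any specific product.

```python
# Minimal sketch: detect significant state changes in the twin and react.
# The state layout and the reorder_from_backup() handler are hypothetical.

EXPECTED_RATE = 500   # expected hourly delivery rate (illustrative units)
TOLERANCE = 0.2       # allow 20% deviation before we act

def reorder_from_backup(supplier: str, shortfall: float) -> None:
    # Placeholder for the automated response, e.g. placing an order elsewhere.
    print(f"{supplier} under-delivering by {shortfall}; ordering from backup supplier")

def on_state_change(old_state: dict, new_state: dict) -> None:
    """Compare the previous and current twin state and react to deviations."""
    threshold = EXPECTED_RATE * (1 - TOLERANCE)
    for supplier, rate in new_state.items():
        previous = old_state.get(supplier)
        if previous is None:
            continue  # new supplier, nothing to compare against yet
        if rate < threshold <= previous:
            # The rate just dropped below expectations: generate an exception event.
            reorder_from_backup(supplier, shortfall=EXPECTED_RATE - rate)

# Example: the flour supplier's delivery rate drops from 520 to 300 units/hour.
on_state_change({"flour_supplier": 520.0}, {"flour_supplier": 300.0})
```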
How to build the digital twin
How can we implement and apply a digital twin? The first step is to capture the requirements, so that the scope is known and aligned with the goal of the digital twin. To this end, identify up front which users and applications will use the digital twin and what information they are interested in.
Once you know what needs to be in the digital twin, and therefore what your initial model is, you need to identify all the data sources that feed the digital twin and keep it up to date. This is often a major challenge during implementation, since obtaining access to all the different data sources can be a tedious task. Once you have obtained the data, it needs to be pre-processed before it can be used: transformed, enriched, and cleaned into a normalized format from which the state can be maintained. Only once the data has been pre-processed can the stream be aggregated into a state database. Stateful stream processing technologies are well suited to such integrations: the state itself can be stored in a SQL database, while the stream does not need to be stored at all and is processed on the fly into the state.
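As an illustration of this pattern, here is a minimal Python sketch that folds a pre-processed event stream into a SQLite state table. In practice a stateful stream processing platform would play this role; the event fields used here are assumptions.

```python
# Skeleton of the pattern: each event is pre-processed and folded into a state
# table on the fly; the raw stream itself is never stored.
import sqlite3

conn = sqlite3.connect("twin_state.db")
conn.execute("CREATE TABLE IF NOT EXISTS twin_state (key TEXT PRIMARY KEY, value REAL)")

def pre_process(event: dict):
    """Transform, enrich and clean one raw event; return None to discard it."""
    if event.get("value") is None:
        return None                                   # cleaning: drop null measurements
    return {"key": event["source"], "value": float(event["value"])}

def apply_to_state(event: dict) -> None:
    """Upsert the cleaned event into the state database."""
    conn.execute(
        "INSERT INTO twin_state (key, value) VALUES (?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
        (event["key"], event["value"]),
    )
    conn.commit()

def handle(raw_event: dict) -> None:
    cleaned = pre_process(raw_event)
    if cleaned is not None:
        apply_to_state(cleaned)

handle({"source": "oven_line_1", "value": "42.0"})
```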
Consider our bakery again: we want to capture the hourly intake of ingredients as part of monitoring the production process. Each sensor provides a data stream of the ingredients it delivers, which we must ingest. We then normalize and clean the input stream so that we only retain valid, normalized data. In our example, this means converting all measurements to grams and removing all null measurements. Finally, we aggregate this data per day so that we know the current ingestion rate for any ingredient.
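A small sketch of that bakery pipeline, assuming a sensor event format with an ingredient, an amount, a unit, and a timestamp, could look like this:

```python
# Sketch of the bakery pipeline: normalize measurements to grams, drop nulls,
# and aggregate intake per ingredient per day. The sensor event format is assumed.
from collections import defaultdict
from datetime import datetime

# State: (ingredient, date) -> total grams ingested that day.
daily_intake = defaultdict(float)

UNIT_TO_GRAMS = {"g": 1, "kg": 1000, "lb": 453.592}

def ingest(event: dict) -> None:
    """Process one sensor reading."""
    if event.get("amount") is None:
        return                                                   # cleaning: drop null measurements
    grams = event["amount"] * UNIT_TO_GRAMS[event["unit"]]       # normalization: convert to grams
    day = datetime.fromisoformat(event["timestamp"]).date().isoformat()
    daily_intake[(event["ingredient"], day)] += grams            # aggregation per day

ingest({"ingredient": "flour", "amount": 2.5, "unit": "kg",
        "timestamp": "2024-05-01T09:00:00"})
print(daily_intake[("flour", "2024-05-01")])   # 2500.0
```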
Once we have our state, we need to make it explorable on demand for users and applications. Making the data from the digital twin accessible is key to ensuring adoption and support for the digital twin over time. APIs on top of the digital twin can expose the real-time state to any interested consumer. Depending on the requirements, this data can be integrated into existing dashboarding and reporting tools, into decision support systems, or into a dedicated UI; the latter is most applicable for complex processes or assets that require specialized visualizations.
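As a sketch of such an API, the following uses Flask (one possible choice among many) to expose the aggregated daily intake; the endpoint path and the in-memory state stand-in are illustrative assumptions.

```python
# Minimal sketch of an API on top of the twin state, using Flask.
from flask import Flask, jsonify

app = Flask(__name__)

# In practice this would be the state database maintained by the stream
# processor; a dict keeps the sketch self-contained.
daily_intake = {("flour", "2024-05-01"): 2500.0}

@app.route("/ingredients/<name>/intake/<day>")
def get_intake(name: str, day: str):
    """Return the aggregated intake (in grams) for one ingredient on one day."""
    grams = daily_intake.get((name, day))
    if grams is None:
        return jsonify({"error": "no data for this ingredient and day"}), 404
    return jsonify({"ingredient": name, "date": day, "grams": grams})

if __name__ == "__main__":
    app.run(port=8080)   # e.g. GET /ingredients/flour/intake/2024-05-01
```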
In addition to on-demand applications, event-driven applications can be used to respond immediately when the digital twin changes. For instance, in our bakery we may want to respond immediately if the hourly ingestion drops below the threshold value during the day. By building event-driven applications on top of the digital twin, we can respond as incidents occur, rather than when it is too late.
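A minimal sketch of such an event-driven check could look like the following, assuming the stream processor calls a hook after each state update; the threshold value and the notification channel are placeholders.

```python
# Sketch of an event-driven check: whenever the twin state is updated, compare
# the latest hourly intake against a threshold and alert immediately.
HOURLY_THRESHOLD_GRAMS = 10_000   # illustrative threshold

def notify(message: str) -> None:
    print(f"ALERT: {message}")    # placeholder: e-mail, chat message, work order, ...

def on_intake_update(ingredient: str, hour: str, grams: float) -> None:
    """Called by the stream processor after each state update."""
    if grams < HOURLY_THRESHOLD_GRAMS:
        notify(f"{ingredient} intake in hour {hour} is {grams:.0f} g, "
               f"below the {HOURLY_THRESHOLD_GRAMS} g threshold")

on_intake_update("flour", "2024-05-01T09:00", 8_200.0)
```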
Since the digital twin is constantly evolving, it can be beneficial to use no-code or low-code tools, such as streaming SQL or model-driven tools, to design and maintain the stream processors that keep the digital twin up to date. This allows business developers to update the digital twin as needed, based on constantly changing requirements, without the need for traditional software developers. eMagiz is one of the organizations that can help you with the challenges around maintaining a digital twin using model-driven development tools, including tooling to ingest data from all kinds of data sources and applications, to pre-process and aggregate data into state, to expose state data through an API, and to do real-time exception detection on change streams. Reach out to us to learn more about how we can help you implement your digital twin.
By Mark de la Court, Product owner @ eMagiz