Formerly known as Wikibon

A Guide to Concepts in Digital Twins

Premise. The digital twin programming concept will extend well beyond IoT. It presents a richer representation of real things than traditional programming technologies do. Users need conventions for core concepts and how they fit together.

In our conversations with the Wikibon community, we hear both interest and confusion regarding the notion of digital twins. We believe digital twins will have enormous impacts on IoT — and future classes of digital business systems. Adopting these notions, however, requires coherent conventions, which we present in Table 1.


DT IoT Artifact Definition
Digital Twin (DT) Wikibon’s definition of a Digital Twin is a representation or model of a product, process or service, customer, supplier – any entity involved in a business. IBM’s definition is a bit narrower. To IBM, a DT is a working model that digitizes the operations of a physical product and its subsystems, including mechanical, electronic, and software. The DT is also meant to capture the structural attributes and behavior of an entity as designed, built, tested, deployed, operated, and serviced. The DT can also be considered a rendering into digital terms whose fidelity is a promise that grows over time.
Edge Device The edge device is typically the physical asset represented by the DT. Sensors instrument the edge device and analytics take place locally in order to achieve the lowest possible latency. Analytics include predictions from machine learning models that are usually trained in the cloud, where the richest and largest datasets reside. Edge devices communicate with each other via a high-speed backplane; for example, the wheels and brakes in a car self-adjust together in order to avoid locking.
Gateway controller This server typically connects to multiple edge devices. The server is responsible for ingesting data coming from the sensors on the edge devices or physical assets, analyzing the data, and then “programming” the edge devices through their DTs. The gateway controller can aggregate and filter data from multiple edge devices. The filtered data represents a small fraction of all the sensor data, but it is what must be published to the cloud for future retraining of the models. The gateway controller also has a user interface to configure sensors on the edge devices and to provision and manage software on the edge devices, including models trained in the cloud or updated locally. The administrator for gateway controllers comes from operations technology (OT), not from information technology (IT), whose staff tend to applications in the data center or the cloud.
Operational model The operational model collects the sensor data to create the behavioral representation of the DT. The operational model captures the range of operational states of a device (for an elevator: open, closed, opening, closing, moving up, moving down). Simulating or gathering actual data from an experiment (like a vehicle in a wind tunnel) creates an operational model. The operational model is used for creating the machine learning model(s). The ML models can be used for prescriptive suggestions, such as for maintenance or for a better product design. With more data or simulations, the operational model improves in fidelity over time.
Data model The data model represents the structural properties of the Digital Twin but not its operation. This structural representation gets richer over time. Ultimately, it is similar to a bill of materials for a discrete manufactured product.
API Exposes some or all of the operational model of the DT to developers. It should conform to the data model for maximum developer usability. There is a “downward-facing” data ingestion API and a backplane that does analysis (complex event processing (CEP), predictive, or prescriptive) and then publishes the output through an “upward-facing” API that other applications consume.
Level of detail The hierarchical structure that organizes multiple DTs and their APIs and data models. For example, the DTs for four anti-lock brakes fit within the DT for the drivetrain of a car.
Machine learning models Models can correspond to multiple levels of detail in a DT. At the lowest level, a model might correspond to a valve on a pipe that has a sensor reporting on the volume flowing through it. At a higher level, a model might correspond to a car, though there are likely to be additional models at lower levels of detail. Models can be either predictive or prescriptive. The models explain what is happening and what will happen; with prescriptive models, you can also adjust the inputs to get the optimal output. In other words, prescriptive models let you perform simulations. With a car, you can look at wind tunnel data across multiple simulations or physical experiments and optimize the design for a balance of wind resistance and styling. You base the mechanics of model building on the observations in the experiment or simulation. The process also works at multiple levels of detail, like modeling the brakes, which are part of the drivetrain, which is part of the car. For each model, you do feature selection and engineering, and then training. How much each feature weighs or contributes to each model is determined by the training process.
ML model features Features are the drivers or independent variables in an ML model. You can think of them as the knobs that represent volume, treble, and bass that collectively drive the sound output of a stereo. Data scientists select and engineer the features they believe will drive the most accurate answers when fed first training data and then live data.

Challenge = risk of quality of DT representation accruing to competitors thru IBM


ML model feature coefficients Features have coefficients, more colloquially known as weights or values, that adjust the contribution of each feature in the ML model. While features are knobs on the stereo, their coefficients correspond to how the knobs are tuned. The training process for a model typically sets these coefficients so that new recordings sound faithful. A high number on the bass produces a deep sound, independent of the volume level.
ML model hyper-parameters Hyper-parameters are the metadata that describe such things as the structure of the model or its learning rate so that it can best fit the data. Data scientists typically set these parameters manually, while the feature coefficients come from the data that trains the model.
ML model hyper-parameter coefficients Hyper-parameter coefficients adjust the weight of each of the hyper-parameters, much the way feature coefficients adjust the weight of each of the features.
Knowledge Graph Creates a common abstraction layer in the form of a data model and API that integrate multiple component data models and APIs – like the structure model from a CAD design, the operational model based on behavior observed by sensors, the maintenance model that predicts anomalies in the operational model, etc. The KG knows how the pieces fit together and provides semantic consistency to an operator and a developer. In addition to a generic version, there are customer-specific extensions. But all customers should be running an instance of the canonical knowledge graph.
Canonical model Represents the generic version of a DT or data model or Knowledge Graph that has no customer-specific extensions.
Security A typical rule of thumb for enhancing security is to minimize the surface area of an object such as a DT. This can be challenging with DTs because many industrial devices in operations achieve security through physical isolation from traditional IT networks.
Backward compatibility Enhancements to the DT co-developed at one customer require that prior customers be able to upgrade to the canonical model of this most recent DT. Prior customers should also be able to add back their specific extensions to the canonical model without breaking compatibility.

Table 1: Glossary of Digital Twin artifacts
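
The Data model, Operational model, and Level of detail entries in Table 1 describe how structure, behavior, and hierarchy combine in a DT. The following is a minimal Python sketch of that idea, using the glossary's own example of four anti-lock brakes inside a drivetrain inside a car. The class name `DigitalTwin`, its fields, the `bill_of_materials` helper, and the `released` state value are all hypothetical names chosen for illustration; they are not part of any Wikibon or IBM API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DigitalTwin:
    """Hypothetical DT node combining three glossary concepts."""
    name: str
    # Data model: structural properties only, not behavior.
    attributes: Dict[str, str] = field(default_factory=dict)
    # Operational model: the current operational state, if known.
    state: Optional[str] = None
    # Level of detail: child DTs nested inside this one.
    children: List["DigitalTwin"] = field(default_factory=list)

    def add(self, child: "DigitalTwin") -> "DigitalTwin":
        """Attach a child DT one level of detail below this one."""
        self.children.append(child)
        return child

    def bill_of_materials(self, depth: int = 0) -> List[Tuple[int, str]]:
        """Flatten the hierarchy into (depth, name) rows, like a BOM."""
        rows = [(depth, self.name)]
        for child in self.children:
            rows.extend(child.bill_of_materials(depth + 1))
        return rows


# Build the example hierarchy: car -> drivetrain -> four anti-lock brakes.
car = DigitalTwin("car")
drivetrain = car.add(DigitalTwin("drivetrain"))
for position in ("front-left", "front-right", "rear-left", "rear-right"):
    drivetrain.add(
        DigitalTwin(
            f"anti-lock brake ({position})",
            attributes={"position": position},
            state="released",  # assumed state name, for illustration only
        )
    )

# Print an indented, BOM-style view of the structure.
for depth, name in car.bill_of_materials():
    print("  " * depth + name)
```

Keeping structure (`attributes`), behavior (`state`), and hierarchy (`children`) in separate fields mirrors the glossary's distinction between the data model and the operational model; a real DT would likely hold far richer versions of each.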
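
The glossary's stereo analogy for features, feature coefficients, and hyper-parameters can be made concrete with a small example. The sketch below fits a linear model by plain gradient descent in pure Python. The knob settings, the "true" coefficients, the learning rate, and the epoch count are all made-up values for illustration, and a linear model is only one simple choice among many; nothing here is taken from the original article beyond the three knob names.

```python
# Features: hypothetical knob settings as (volume, treble, bass) per recording.
features = [
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 1.0),
    (1.0, 0.0, 1.0),
]

# Made-up "true" coefficients used only to generate training targets.
true_coefficients = (2.0, -1.0, 0.5)
targets = [
    sum(c * x for c, x in zip(true_coefficients, row)) for row in features
]

# Hyper-parameters: chosen by hand, not learned from the data.
learning_rate = 0.1
epochs = 2000

# Feature coefficients: learned from the training data.
coefficients = [0.0, 0.0, 0.0]
for _ in range(epochs):
    gradient = [0.0, 0.0, 0.0]
    for row, target in zip(features, targets):
        prediction = sum(c * x for c, x in zip(coefficients, row))
        error = prediction - target
        # Gradient of the mean squared error with respect to each coefficient.
        for j, x in enumerate(row):
            gradient[j] += 2.0 * error * x / len(features)
    coefficients = [
        c - learning_rate * g for c, g in zip(coefficients, gradient)
    ]

print([round(c, 3) for c in coefficients])
```

The split is visible in the code: `learning_rate` and `epochs` are set before training starts, while `coefficients` change on every pass over the data.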
