A major aspect of Industry 4.0 is the acquisition and analysis of big data: turning data into actionable information and enabling systems to make decisions on their own. Despite new technologies, most manufacturing businesses still use clipboards and paper to collect data. In many cases, as much as 90% of that data ends up orphaned on-site, never leaving the facility. This presents real challenges for anyone who wishes to benefit from Industry 4.0.
The good news is that new technologies can help, and users can prepare for data transformation in just a few steps: accessing more data, computing at the edge, cleaning data, contextualizing data, and standardizing common data structures.
Getting more data is an important part of Industry 4.0. Industrial operating environments are complex, involving hundreds of different protocols, communication media, and legacy devices. Digital transformation must be implemented from the bottom up, starting with optimizing the operational technology (OT) layer. This requires a new attitude: keep systems open, interoperable, and secure. The first step is to get all the data in an efficient way – to be able to access it easily from anywhere, whenever it is needed.
MQTT’s publish/subscribe functionality simplifies communication and helps move polling to the edge of the network. Image source: Inductive Automation
One of the main barriers to data access is the traditional software licensing model, which charges per tag or per user. These models cannot scale, and so they hinder growth. Furthermore, industrial applications have traditionally been closed, proprietary, and limited in functionality and connectivity. Today, we need fundamentally unlimited, open models that unlock new expansion opportunities and greater scalability.
Another challenge is integrating new smart sensors and devices with existing legacy equipment. It is important to have infrastructure that supports both, and it boils down to one key concept: architectural change. Instead of connecting legacy devices directly to applications through communication protocols, connect devices to infrastructure. Operators need a plug-and-play, reliable, and scalable OT solution.
Open and Interoperable Architecture
This new architecture is based on Message Queuing Telemetry Transport (MQTT). MQTT is a publish/subscribe protocol that enables a message-oriented middleware architecture. In the information technology (IT) world, this is not a new concept; Enterprise Service Buses (ESBs) have long been used to integrate applications through a bus-like infrastructure. With MQTT, a device publishes its data to a local or cloud-based MQTT server when a value changes (report by exception). Applications subscribe to the MQTT server for that data, which means they never need to connect to the end device itself. MQTT has the following advantages:
● Open standard / interoperable (OASIS standard; Eclipse open-source implementation (Tahu));
● Decoupling of devices and applications;
● Very low bandwidth requirements;
● Transport Layer Security (TLS);
● Remotely originated connections (outbound only; no inbound firewall rules);
● State awareness;
● Single source of data;
● Automatic tag discovery;
● Data buffering (store and forward);
● Plug-and-play functionality.
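The decoupling of devices and applications can be illustrated with a minimal in-memory publish/subscribe sketch. This is a toy model of the architecture only, not the MQTT wire protocol; a real deployment would use an MQTT server and a client library, and the topic names here are hypothetical.

```python
from collections import defaultdict

class MiniBroker:
    """Toy in-memory pub/sub broker illustrating MQTT-style decoupling.
    Illustrates the architecture only -- not the real MQTT protocol."""
    def __init__(self):
        self._subs = defaultdict(list)  # topic -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, payload):
        # The publisher never knows who (if anyone) is listening.
        for cb in self._subs[topic]:
            cb(topic, payload)

broker = MiniBroker()
received = []

# A SCADA application subscribes to a tag topic...
broker.subscribe("site1/plc1/ambient_temp", lambda t, p: received.append((t, p)))

# ...and an edge device publishes without knowing about SCADA.
broker.publish("site1/plc1/ambient_temp", 21.5)
print(received)  # [('site1/plc1/ambient_temp', 21.5)]
```

Note that the device and the application never reference each other, only the broker and the topic: either side can be replaced or multiplied without touching the other.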
How do you arrive at this new architecture? The answer is edge computing and protocol conversion. Suppose 10 Modbus devices need to be connected to a supervisory control and data acquisition (SCADA) system. Users can deploy an edge gateway that supports both Modbus and MQTT, moving polling closer to the programmable logic controller (PLC). This allows the gateway to poll more information at a faster rate and publish values to a central MQTT server as they change. Instead of connecting directly to end devices, SCADA connects and subscribes to the MQTT server for data. This is an important step in future-proofing SCADA systems: when a user purchases an MQTT-enabled sensor or upgrades a device, SCADA gains access to its data without any knowledge of the end device.
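The gateway's "publish values as they change" behavior (report by exception) can be sketched as a polling loop that compares each reading against the last published value. The `read_register` and `publish` callables stand in for a real Modbus read and MQTT publish; the device name, register number, and topic layout are all hypothetical.

```python
published = []
last_values = {}

# Hypothetical stand-ins for a Modbus register read and an MQTT publish.
readings = iter([50, 50, 51])
def read_register(device, register):
    return next(readings)

def publish(topic, value):
    published.append((topic, value))

def poll_once(device, register):
    """Report by exception: publish only when the polled value changes."""
    value = read_register(device, register)
    key = (device, register)
    if last_values.get(key) != value:
        last_values[key] = value
        publish(f"edge/{device}/{register}", value)

for _ in range(3):
    poll_once("plc1", 40001)

print(published)  # only the changes: [('edge/plc1/40001', 50), ('edge/plc1/40001', 51)]
```

Because only changes cross the network, the gateway can poll the PLC far faster than a remote SCADA host could, while the MQTT link carries a fraction of the traffic.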
Help the system understand the data
Not only do users need to access the data, they also need to ensure that it is valid, carries contextual information, and is part of a common structure where applicable. This is an essential step before applying analytics and machine learning: the system can only use data correctly if it understands it. New sensors and devices often have these capabilities built in, but older devices do not. There are hundreds of different polling protocols that need to be mapped and extended, and the addressing schemes of most PLCs are not easy to understand. These mappings often exist in SCADA, but they still lack context, contain invalid data, or do not follow standard data structures.
The best place to do this is at the edge gateway connected to the PLC. It requires software with capabilities to clean data, contextualize data, and support data structures.
Let’s start with cleaning the data. Suppose a sensor is connected to a PLC and occasionally loses its signal. When the signal is lost, the value in the PLC drops to 0. The value may legitimately be 0 at times, but if the last reading was 50, a sudden 0 is probably not real. In such cases, look at how the data changes to determine whether the current value should be ignored. Applying this logic in calculated tags solves the problem. Before sharing data with other systems, always confirm its validity as close to the source as possible.
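A minimal sketch of such a calculated-tag rule: hold the last good value and flag the reading as invalid when it drops suddenly to 0. The threshold is illustrative; real limits depend on the sensor and the process.

```python
def clean_reading(previous, current, max_drop=40.0):
    """Treat a sudden drop to 0 as a probable lost signal, not a real value.

    Returns (value_to_use, is_valid). max_drop is an illustrative threshold:
    a 0 reading is only suspect if the previous value was well above it.
    """
    if current == 0 and previous is not None and previous > max_drop:
        return previous, False   # hold the last good value, mark invalid
    return current, True

value, valid = clean_reading(previous=50.0, current=0.0)
assert (value, valid) == (50.0, False)   # 50 -> 0 looks like a dropped signal

value, valid = clean_reading(previous=2.0, current=0.0)
assert (value, valid) == (0.0, True)     # a gradual approach to 0 is plausible
```

Running this at the edge gateway means downstream systems never see the spurious zeros at all, which is exactly the "validate closest to the source" principle above.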
Another critical step is providing context for the data. For example, a user’s Modbus PLC may have a tag at reference address 40001. In a SCADA system, it would be mapped to some tag name, such as “ambient temperature”. With only this data, it is impossible to know whether the temperature is in degrees Celsius or Fahrenheit, or what its high and low limits are. Without context, data analytics and machine learning models can produce misleading results.
An edge gateway that supplies metadata such as the tag name, scaling, engineering units, engineering high and low limits, documentation, and tooltips gives other systems the critical information they need to understand the underlying data.
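One way to picture such a contextualized tag is as a small record carrying the metadata alongside the raw value. The field names and the sample values below are illustrative, not a specific gateway's data model.

```python
from dataclasses import dataclass

@dataclass
class Tag:
    """A contextualized tag as an edge gateway might publish it (illustrative)."""
    name: str
    raw_address: str      # e.g. the Modbus reference address
    raw_value: float      # value as read from the device
    scale: float = 1.0    # raw -> engineering-unit conversion
    offset: float = 0.0
    units: str = ""
    eng_low: float = float("-inf")   # engineering low limit
    eng_high: float = float("inf")   # engineering high limit
    documentation: str = ""

    @property
    def value(self) -> float:
        """Scaled value in engineering units."""
        return self.raw_value * self.scale + self.offset

    @property
    def in_range(self) -> bool:
        return self.eng_low <= self.value <= self.eng_high

temp = Tag(name="ambient_temperature", raw_address="40001", raw_value=215,
           scale=0.1, units="degC", eng_low=-40.0, eng_high=85.0,
           documentation="Ambient temperature at the control panel")
print(temp.value, temp.units)  # the raw count 215 becomes 21.5 degC
```

Downstream analytics receiving this record no longer has to guess units or limits: everything needed to interpret 40001 travels with the value.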
Standardize common data structures
The final step is to standardize common data structures across the enterprise. This step is often ignored since the data can be different for each site and it is difficult to find a common data model. Analysis packages and machine learning models require the same data structures for common objects. Users don’t want to create different analytics or machine learning models for each site. This goes beyond a single data point and is a collection of data points for a known object.
It is important to survey each site to find a common model and to use an edge gateway that supports User Defined Types (UDTs). This means adjusting the data at each site to fit the model, which may involve scaling, calculated tags, transformations, and so on. On the surface the data then shares the same structure, while the site-specific complexity is hidden behind it.
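The idea can be sketched as per-site adapters that map site-specific tag names and units onto one common model. The "Pump" model, the tag names, and the unit choices below are hypothetical examples, not a standard.

```python
# A common "Pump" model (UDT-style) shared by every site.
COMMON_FIELDS = ("flow_m3h", "pressure_bar", "running")

def site_a_adapter(raw):
    """Site A already reports flow in m^3/h and pressure in bar."""
    return {"flow_m3h": raw["FLOW"],
            "pressure_bar": raw["PRESS"],
            "running": raw["RUN"] == 1}

def site_b_adapter(raw):
    """Site B reports flow in litres/min and pressure in kPa, so convert."""
    return {"flow_m3h": raw["flow_lpm"] * 0.06,     # 1 L/min = 0.06 m^3/h
            "pressure_bar": raw["press_kpa"] / 100.0,  # 100 kPa = 1 bar
            "running": raw["state"] == "RUN"}

pump_a = site_a_adapter({"FLOW": 12.0, "PRESS": 2.4, "RUN": 1})
pump_b = site_b_adapter({"flow_lpm": 200.0, "press_kpa": 240.0, "state": "RUN"})

# Both sites now present the same structure to analytics.
assert set(pump_a) == set(pump_b) == set(COMMON_FIELDS)
```

A single analytics or machine learning model can now consume pumps from both sites, because the scaling and renaming happened at the edge rather than in the model.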
Setting up a new operational architecture solves the problem of getting big data into the infrastructure. Analytics and machine learning cannot begin until the data is accessible, and the data must be valid and carry context so it can be understood. With this new mindset and infrastructure, users can realize the benefits of advanced technology with a few small steps and adjustments.