Extract, Transform, Load, more commonly known as ETL, is a system that extracts data from source systems, enforces standards, conforms data so that different sources can be used together, and delivers data so that others can build applications. ETL tools have been around for 50 years; however, Industry 4.0 places different demands on their critical characteristics. Industrial ETL has unique and sophisticated requirements, so we need to reevaluate data architecture and implement new industrial data infrastructure solutions.
Extract: Not every data source on the shop floor is ready to be extracted the way traditional transactional data is pulled from a database. Most of the time the data is not stored; it changes in real time, and you need to stream it from hundreds of machines and systems.
Transform: Unlike clearly defined and standardized transactional data, most data sources on the shop floor, even those for similar machinery, are neither correlated nor in a usable format. They come with no descriptions, units of measure, operating ranges, or other descriptive information. For any sort of data analysis to be possible, the data points must be standardized, normalized, and in some cases conditioned or calculated from component measures, for example by converting Fahrenheit temperature values to Celsius.
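To make the transform step concrete, here is a minimal Python sketch of how a raw shop-floor value could be standardized and given context. Everything named here is an assumption for illustration: the `TagContext` catalog, the raw tag address `PLC7.DB12.W4`, the standardized name, and the operating range are all hypothetical, and a real deployment would take them from its own asset model.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TagContext:
    """Descriptive metadata the raw signal lacks (all values hypothetical)."""
    name: str          # standardized tag name
    description: str
    source_unit: str   # unit the machine reports, e.g. "degF"
    target_unit: str   # unit the standardized model uses, e.g. "degC"
    low: float         # operating range, expressed in target_unit
    high: float


@dataclass(frozen=True)
class Reading:
    """A standardized, contextualized data point."""
    tag: str
    value: float
    unit: str
    description: str
    in_range: bool


def fahrenheit_to_celsius(value_f: float) -> float:
    return (value_f - 32.0) * 5.0 / 9.0


def standardize(raw_tag: str, raw_value: float,
                catalog: Mapping[str, TagContext]) -> Reading:
    """Attach context to a raw value and convert it to the target unit."""
    ctx = catalog[raw_tag]
    if ctx.source_unit == ctx.target_unit:
        value = raw_value
    elif ctx.source_unit == "degF" and ctx.target_unit == "degC":
        value = fahrenheit_to_celsius(raw_value)
    else:
        # Only the conversion mentioned in the text is sketched here.
        raise ValueError(f"no conversion from {ctx.source_unit} to {ctx.target_unit}")
    return Reading(
        tag=ctx.name,
        value=round(value, 2),
        unit=ctx.target_unit,
        description=ctx.description,
        in_range=ctx.low <= value <= ctx.high,
    )


# Hypothetical catalog entry: a raw PLC address mapped to a described tag.
CATALOG = {
    "PLC7.DB12.W4": TagContext(
        name="oven_1.zone_temperature",
        description="Oven 1 zone temperature",
        source_unit="degF",
        target_unit="degC",
        low=150.0,
        high=260.0,
    ),
}

reading = standardize("PLC7.DB12.W4", 392.0, CATALOG)
print(reading.tag, reading.value, reading.unit, reading.in_range)
```

The catalog keeps the description, unit, and operating range next to the conversion rule, so two similar machines that report in different units can end up as comparable data points.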
Load: Managing the delivery of data is important. There are significant costs, as well as security risks, associated with storing incorrect, corrupt, or useless data. Extraction and transformation should be edge-driven, handled as close to the machinery as possible. For the most efficient use, only clean, contextualized data should be used by local edge analytics or stored in on-premises data centers or in the cloud.
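One way to picture the edge-side gate before loading is the short Python sketch below. It assumes a hypothetical record shape of `(tag, value)` pairs and a single operating range per batch; the tag name and limits are made up for illustration. It also assumes that out-of-range values should be kept out of storage, which may not suit every plant, since such values can be genuine alarms worth routing elsewhere.

```python
import math
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[str, Optional[float]]  # (tag, value); hypothetical record shape


def is_loadable(value: Optional[float], low: float, high: float) -> bool:
    """True only for finite values inside the expected operating range."""
    if value is None or not math.isfinite(value):
        return False
    return low <= value <= high


def gate(samples: Iterable[Sample], low: float,
         high: float) -> Tuple[List[Sample], int]:
    """Split a batch into records worth loading and a count of rejected ones."""
    accepted: List[Sample] = []
    rejected = 0
    for tag, value in samples:
        if is_loadable(value, low, high):
            accepted.append((tag, value))
        else:
            rejected += 1
    return accepted, rejected


batch = [
    ("oven_1.zone_temperature", 200.0),
    ("oven_1.zone_temperature", float("nan")),  # corrupt reading
    ("oven_1.zone_temperature", None),          # missing reading
    ("oven_1.zone_temperature", 9999.0),        # outside the operating range
]
accepted, rejected = gate(batch, low=150.0, high=260.0)
print(accepted, rejected)
```

Counting rejected records instead of silently dropping them gives the edge node a simple data-quality signal to report alongside the clean data it forwards.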