Unified Namespace Implementation with MongoDB and MaestroHub

In the complex world of modern manufacturing, a crucial challenge has long persisted: how to seamlessly connect the physical realm of industrial control systems with the digital landscape of enterprise operations. The International Society of Automation’s ISA-95 standard, often visualized as the automation pyramid, has emerged as a guiding light. As shown below, this five-level hierarchical model empowers manufacturers to bridge the gap between these worlds, unlocking a path toward smarter, more integrated operations.

A pyramid representing automation where data moves up or down one layer at a time, using point-to-point connections. There are five layers, each labeled. From bottom to top, the layers are Sensors and Actuators, PLCs and Controllers, SCADA, MES, and ERP.
Figure 1: In the automation pyramid, data moves up or down one layer at a time, using point-to-point connections.

Manufacturing organizations face a number of challenges when implementing smart manufacturing applications due to the sheer volume and variety of data generated. An average factory produces terabytes of data daily, including time series data from machines stored in process historians and accessed by supervisory control and data acquisition (SCADA) systems. Additionally, manufacturing execution systems (MES), enterprise resource planning (ERP) systems, and other operations software generate vast amounts of structured and unstructured data. Globally, the manufacturing industry generates an estimated 1.9 petabytes of data annually.

Manufacturing leaders are eager to leverage their data for AI and generative AI projects, but a Workday Global Survey reveals that only 4% of the survey’s respondents believe their data is fully accessible for such applications. Data silos are a significant hurdle, with data workers spending an average of 48% of their time on data search and preparation.

A popular approach to making data accessible is consolidating it in a cloud data warehouse and then adding context. However, this can be costly and inefficient, as dumping data without context makes it difficult for AI developers to understand its meaning and origin, especially for operational technology time series data.

A diagram depicting how uncontextualized data is pushed to a data warehouse, and how adding context to that data is inefficient and expensive. The data warehouse is represented by a data layer graphic inside a cloud, which represents cloud-hosted applications. Connected to the data warehouse are images representing time series data and images representing structured and unstructured data, in this case tables, PDFs, Excel sheets, etc. All the data from each point flows into the data warehouse.
Figure 2: Pushing uncontextualized data to a data warehouse and then adding context is expensive and inefficient.

All these issues underscore the need for a new approach—one that not only standardizes data across disparate shop floor systems, but also seamlessly weaves context into the fabric of this data. This is where the Unified Namespace (UNS) comes in.

This image is the same as the prior image, but in this case Unified Namespace provides governance over the data flowing into the data warehouse and ensures that the right data and context is provided to all the applications connected to it.
Figure 3: Unified Namespace provides the right data and context to all the applications connected to it.

Unified Namespace is a centralized, real-time repository for all production data. It provides a single, comprehensive view of the business’s current state. Using an event-driven architecture, applications publish real-time updates to a central message broker, which subscribers can consume asynchronously. This creates a flexible, decoupled ecosystem where applications can both produce and consume data as needed.
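
The publish/subscribe pattern described above can be sketched in a few lines. This is a minimal in-memory illustration, not a real broker: the class name `TinyBroker`, the topic strings, and the values are all hypothetical, and delivery here is synchronous, whereas a production broker delivers to subscribers asynchronously.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[str, Any], None]


class TinyBroker:
    """In-memory stand-in for a central message broker (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        # Latest value per topic: the "current state" view of the business.
        self.current_state: dict[str, Any] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a consumer for one exact topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, value: Any) -> None:
        """Record the latest value, then notify every subscriber of the topic."""
        self.current_state[topic] = value
        for handler in self._subscribers[topic]:
            handler(topic, value)


broker = TinyBroker()
received: list[tuple[str, Any]] = []

# A consumer only knows the topic, not which system produces the data.
broker.subscribe("acme/plant1/line2/temperature",
                 lambda topic, value: received.append((topic, value)))

# Producers publish without knowing who (if anyone) is listening.
broker.publish("acme/plant1/line2/temperature", 71.5)
broker.publish("acme/plant1/line2/speed", 120)
```

The producer and the consumer never reference each other, which is the decoupling the UNS relies on. Note that `current_state` holds only the latest value per topic, so nothing in this sketch provides history.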

Diagram showing how UNS enables all enterprise systems to collect data from one centralized location. In this diagram, Unified Namespace is at the center with each of the following categories flowing into it: MES, ERP, SCADA, third-party apps, and IIoT gateways. The IIoT gateways have two additional categories flowing into them: Sensors and Actuators.
Figure 4: UNS enables all the enterprise systems to have one centralized location to get the data they need for what they want to accomplish.

MaestroHub and MongoDB: Solving the UNS challenge

Industry 4.0 was first introduced in 2011 at the Hannover Fair of Industrial Technologies. Its core idea is to establish seamless connectivity and interoperability between the disparate systems used in manufacturing, and UNS aims to deliver exactly that.

Over the past five years, we have seen interest in UNS ramp up steadily, and manufacturers are now looking for practical ways to implement it. In particular, a question we are frequently asked is where the UNS actually lives.

To answer that question, we need to look at popular architecture patterns and the pros and cons of each. The most common pattern is implementing the UNS in an MQTT broker. An MQTT broker acts as an intermediary that receives messages published by clients, filters them by topic, and distributes them to subscribers. Most manufacturers choose MQTT because it is an open protocol that is easy to implement. However, a broker on its own does not give clients access to historical data, which is required to build analytical and AI applications. Another approach is to dump all the data into a data warehouse and add context afterward. This solves the problem of historical data access, but standardizing messages after they have landed in a cloud data warehouse is inefficient.
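
To make "filters the messages by topic" concrete, here is a minimal sketch of MQTT-style topic filter matching, where `+` matches exactly one topic level and `#` matches any number of remaining levels. The function name `topic_matches` and the example topics are hypothetical, and the sketch deliberately omits special cases such as `$`-prefixed system topics.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches the MQTT-style `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level of the filter,
            # and it also matches the parent level itself.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter has more levels than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # A literal level must match exactly; "+" accepts any single level.
            return False

    # Without a trailing "#", both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# A dashboard subscribing to every temperature on any line of one site:
line_temperatures = "acme/site1/+/temperature"
# A historian subscribing to everything below the site:
everything_in_site = "acme/site1/#"
```

A subscriber that needs past values still has to get them from somewhere else, because the broker only routes messages as they arrive.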

A superior solution for comprehensive, real-time data access is combining a single source of truth (SSoT) Unified Namespace platform like MaestroHub with a flexible multi-cloud data platform like MongoDB Atlas. MaestroHub creates an SSoT for industrial data, reducing integration effort for brownfield facilities by up to 80%.

This diagram shows what the flow of data looks like before and after the implementation of MaestroHub. The diagram on the left shows the data flow before MaestroHub, with data running in a jumbled mess and each point of data trying to connect to several other points. On the right, the diagram shows what it looks like after MaestroHub, with each data point connecting directly to MaestroHub, simplifying the data flow and making it more efficient
Figure 5: MaestroHub SSoT creates a unified data integration layer, saving up to 50% of time in data contextualization (Source: MaestroHub).

MaestroHub provides the connectivity layer to all data sources on the factory floor, along with contextualization and data orchestration. This makes it easy to connect the data needed for the UNS, enrich it with more context, and then publish it to consumers using the protocol that works best for them.

Under the hood, MaestroHub stores the metadata of connections, instances, and flows in MongoDB. MongoDB’s flexible data modeling patterns reduce the complexity of mapping and transforming data when it’s shared across different clients in the UNS. Additionally, scalable data indexing overcomes performance concerns as the UNS grows over time.
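
As a rough illustration of why a flexible document model helps here, the sketch below turns two very differently shaped UNS messages into documents that could sit in the same collection. Everything in it is an assumption made for illustration: the slash-delimited Enterprise/Site/Area/Line/Cell topic layout, the field names, the sample payloads, and the index keys. It does not describe MaestroHub's actual schema.

```python
from datetime import datetime, timezone
from typing import Any

# Assumed topic layout: enterprise/site/area/line/cell/<tag...>
LEVELS = ("enterprise", "site", "area", "line", "cell")


def to_document(topic: str, payload: dict[str, Any]) -> dict[str, Any]:
    """Wrap an arbitrary payload with context derived from its topic."""
    parts = topic.split("/")
    return {
        "topic": topic,
        "path": dict(zip(LEVELS, parts)),     # one field per hierarchy level
        "tag": "/".join(parts[len(LEVELS):]),  # whatever follows the hierarchy
        "payload": payload,                    # stored as-is, any shape
        "ts": datetime.now(timezone.utc),
    }


docs = [
    # A flat sensor reading...
    to_document("acme/site1/press-shop/line2/cell4/temperature",
                {"value": 71.5, "unit": "degC"}),
    # ...and a nested MES record, side by side in one collection.
    to_document("acme/site1/press-shop/line2/cell4/work-order",
                {"orderId": "WO-1042", "status": "RUNNING",
                 "qty": {"planned": 500, "good": 212}}),
]

# Hypothetical compound index keys, written in the (field, direction) form
# that pymongo's create_index accepts: latest documents per topic first.
INDEX_KEYS = [("topic", 1), ("ts", -1)]
```

Because the context lives in the same fields on every document, consumers can filter by line or cell without knowing the shape of each payload in advance.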

This diagram shows what the data architecture looks like when using MaestroHub and MongoDB together. MaestroHub acts as the Unified Namespace, controlling the data sent and utilized by applications, while MongoDB Atlas serves as the UNS database.
Figure 6: MaestroHub and MongoDB together enable a real-time UNS plus long-term storage.

MongoDB: The foundation for intelligent industrial UNS

In the quest to build a Unified Namespace (UNS) for the modern industrial landscape, the choice of database becomes paramount. So why turn to MongoDB?

  • Scalability and high availability: It scales effortlessly, both vertically and horizontally (sharding), to handle the torrent of data from sensors, machines, and processes. Operational Technology (OT) systems generate vast amounts of data from these sources, and MongoDB ensures seamless management of that information.
  • Document data model: Its adaptable document model accommodates diverse data structures, ensuring a harmonious flow of information. A Unified Namespace (UNS) must handle data from any factory source, accommodating structure variations. MongoDB’s flexible schema design allows different data models to coexist in a single database, with schema extensibility at runtime. This flexibility facilitates the seamless integration of new data sources and types into the UNS.
  • Real-time data processing: MongoDB Change Streams and Atlas Device Sync empower third-party applications to access real-time data updates. This is essential for monitoring, alerting, and real-time analysis within a UNS, enabling prompt responses to critical events.
  • Gen AI application development with ease: Atlas Vector Search efficiently performs semantic searches on vector embeddings stored in MongoDB Atlas. This capability integrates with large language models (LLMs) to provide relevant context in retrieval-augmented generation (RAG) systems. Because the Unified Namespace (UNS) functions as a single source of truth for industrial applications, connecting gen AI apps to the UNS database gives them accurate and reliable context to retrieve.
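
The real-time access point in the list above can be sketched with MongoDB Change Streams. The example assumes the PyMongo driver, whose `Collection.watch()` method opens a change stream, and a collection whose documents carry a `topic` field; the helper names `build_pipeline` and `follow_changes` and the field name are hypothetical. Change streams require a replica set or sharded cluster, which Atlas clusters provide.

```python
import re
from typing import Any, Callable


def build_pipeline(topic_prefix: str) -> list[dict[str, Any]]:
    """Aggregation stages that keep only writes under one topic prefix."""
    return [
        {
            "$match": {
                "operationType": {"$in": ["insert", "update", "replace"]},
                # Assumes each stored document has a string "topic" field.
                "fullDocument.topic": {"$regex": "^" + re.escape(topic_prefix)},
            }
        }
    ]


def follow_changes(collection: Any,
                   topic_prefix: str,
                   on_change: Callable[[dict[str, Any]], None]) -> None:
    """Call `on_change` with the full document for every matching write.

    `collection` is expected to be a pymongo Collection. With
    full_document="updateLookup", update events also carry the current
    version of the document rather than only the changed fields.
    """
    with collection.watch(build_pipeline(topic_prefix),
                          full_document="updateLookup") as stream:
        for change in stream:  # blocks and yields events as they occur
            on_change(change["fullDocument"])
```

With a real deployment, a call might look like `follow_changes(client["uns"]["messages"], "acme/site1/", print)`, where the database and collection names are again placeholders. The loop runs until the stream is closed, so an alerting service would typically run it in its own worker.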

With the foundational database established, let’s explore MaestroHub, a platform designed to leverage the power of MongoDB in industrial settings.

The MaestroHub platform

MaestroHub is a provider of an SSoT for industrial data, specifically tailored for manufacturers. It achieves this through:

  • Data connectors: MaestroHub connects to diverse data sources using 38 different industrial communication protocols, encompassing OT drivers, files, databases (SQL, NoSQL, time series), message brokers, web services, cloud systems, historians, and data warehouses. The bi-directional nature of 90% of these protocols ensures comprehensive data integration, leaving no data siloed.
  • Data contextualization based on ISA-95: Leveraging ISA-95 Part 2, MaestroHub employs a semantic hierarchy and a clear naming convention for easy navigation and understanding of data topics. The contextualization of the payload is not limited to the unit of measure and definition; it also includes Enterprise/Site/Area/Line/Cell details, which are invaluable for analytics. Data contextualization is an important feature of a UNS platform.
  • Logic flows/rule engine: Adhering to the UNS principle “Do not make any assumptions on how the data will be consumed,” the data should flow flexibly from sources to brokers and from brokers to consumers in terms of rules, frequencies, and multiple endpoints. MaestroHub allows you to set rules such as Always, OnChange, OnTrue, and WhileTrue, where you can dynamically determine the conditions using events and inputs via JavaScript.
  • Insights created by MaestroHub: MaestroHub provides real-time diagnostics of data health by leveraging Prometheus, Elasticsearch, Fluentd, and Kibana. Network problems, changed endpoints, and changed data types are automatically diagnosed and reported as insights. Additionally, MaestroHub uses NATS for queue management and stream analytics, buffering data in the event of a network outage. This allows IT and OT teams to monitor, debug, and audit logs with full data lineage.
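
To give a feel for the rule names mentioned in the list above, here is a small sketch of how Always, OnChange, OnTrue, and WhileTrue could behave when deciding whether a sample is forwarded. The semantics are assumptions inferred from the names, not MaestroHub's documented behavior: OnTrue is treated as firing once when a condition turns true, and WhileTrue as firing on every sample while it stays true. The class `PublishRule` is hypothetical, and the sketch uses Python even though MaestroHub conditions are written in JavaScript.

```python
from typing import Any, Callable, Optional

Condition = Callable[[Any], bool]
MODES = {"Always", "OnChange", "OnTrue", "WhileTrue"}


class PublishRule:
    """Decides, sample by sample, whether a value should be forwarded."""

    def __init__(self, mode: str, condition: Optional[Condition] = None) -> None:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        if mode in {"OnTrue", "WhileTrue"} and condition is None:
            raise ValueError(f"{mode} needs a condition")
        self.mode = mode
        self.condition = condition
        self._has_last = False    # no sample seen yet
        self._last_value: Any = None
        self._last_truth = False  # condition result for the previous sample

    def should_publish(self, value: Any) -> bool:
        if self.mode == "Always":
            return True
        if self.mode == "OnChange":
            # The first sample counts as a change (assumption).
            changed = not self._has_last or value != self._last_value
            self._has_last, self._last_value = True, value
            return changed
        truth = bool(self.condition(value))
        rising_edge = truth and not self._last_truth
        self._last_truth = truth
        # OnTrue: only the false-to-true transition. WhileTrue: every true sample.
        return rising_edge if self.mode == "OnTrue" else truth


def forwarded(rule: PublishRule, samples: list[Any]) -> list[Any]:
    """Return the samples a rule would let through, in order."""
    return [s for s in samples if rule.should_publish(s)]


samples = [70, 70, 82, 85, 79, 90]


def too_hot(value: Any) -> bool:
    return value > 80  # hypothetical alarm threshold
```

Under these assumed semantics, `forwarded(PublishRule("OnChange"), samples)` drops only the repeated 70, `OnTrue` with `too_hot` forwards 82 and 90, and `WhileTrue` forwards 82, 85, and 90.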

Conclusion

The ISA-95 automation pyramid presents significant challenges for the manufacturing industry, including a lack of flexibility, limited scalability, and difficulty integrating new technologies. By adopting a Unified Namespace architecture with MaestroHub and MongoDB, manufacturers can overcome these challenges and achieve real-time visibility and control over their operations, leading to increased efficiency and improved business outcomes.

