Secure Interconnection of IT-OT networks in Industry 4.0

Society is increasingly witnessing how today's industry is adopting new technologies and communication protocols to offer more efficient and reliable services to end-users, with support for inter-domain communication among diverse critical infrastructures. As a consequence of this technological revolution, interconnection mechanisms are required to offer transparency in the connections and protection in the different application domains, without significantly degrading the control requirements. This book chapter therefore presents a reference architecture for the new Industry 4.0 in which the interconnection core is mainly concentrated in the Policy Decision Points (PDP), which can be deployed on technologies for high-volume data processing and storage such as cloud and fog servers. Each PDP authorizes actions in the field/plant according to a set of factors (entities, context and risks) computed through existing access control measures, such as RBAC+ABAC+Risk-BAC (Role/Attribute/Risk-Based Access Control, respectively), to establish coordinated and constrained accesses in extreme situations. Part of these actions also includes proactive risk assessment measures to respond to anomalies or intrusive threats in time.


Introduction
Industry in general is incorporating new technologies, networks and communication protocols to modernize its systems and allow wider connectivity from anywhere, at any time and by any means. Several related works already reflect this progress [66,30,19,65,16], in which multiple cyber-physical devices interact with control processes and manufacturing chains for greater production, distribution and quality of service. This technological confluence is mainly based on the new paradigms of the Internet of Things (IoT), such as the Industrial IoT (IIoT), and the new edge computing infrastructures, such as cloud and fog computing [34]; all of them working as part of a heterogeneous network where Information Technologies (IT) merge with Operational Technologies (OT) in order to maximize, optimize and customize the production tasks, and to offer a greater range of functional possibilities and services for a better industrial sector, economy and society [21].

Interconnection architecture for Industry 4.0 scenarios
When different application domains need to be interconnected with each other, interconnection frameworks based on Policy Enforcement Points (PEP) and PDP are commonly applied [60]. Through a PEP, entities (i.e. physical members, IT-OT devices or software processes) can request access to the different resources of the system. The PEP intercepts the request and forwards it to the PDP, which manages the authorization policies and determines the access level to the different sections of the system according to a set of factors: the type of entity, the resources and the context. Once the PDP has taken the decision, the PEP enforces it by permitting or denying access to the requesting entity, thereby protecting the critical resources of the system. This way of connecting systems can also allow today's industry to interconnect multiple industrial domains, where a cooperative environment is generally required to transparently connect providers, customers and other industrial networks [66]. In this sense, our architecture should follow a collaborative interconnection model in which the interconnection components (i.e. the PDP) maintain certain information about their own federated network. The architectures presented in [25], [11] and [10] are clear examples of federation. The first is a patent in which users and domains are able to connect to each other transparently. The patent characterizes the inter-domain communication through an additional Meta Policy Decision Point (MPDP) that manages authentication and authorization processes between domains. The works [11] and [10], by contrast, assign the whole authentication process to the respective domains and concentrate the whole authorization process in intermediary PDP working like proxies.
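The PEP/PDP exchange described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the class names, the tuple-based policy store and the "permit"/"deny" strings are hypothetical and are not taken from [60] or from any of the cited architectures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    entity_type: str  # e.g. "engineer" or "sensor" (illustrative values)
    resource: str     # identifier of the requested resource
    context: str      # e.g. "normal" or "emergency" (illustrative values)


class PolicyDecisionPoint:
    """Holds the authorization policies and takes the decision."""

    def __init__(self, permitted):
        # Assumption: a policy is a permitted (entity type, resource, context) tuple.
        self._permitted = set(permitted)

    def decide(self, request):
        key = (request.entity_type, request.resource, request.context)
        return "permit" if key in self._permitted else "deny"


class PolicyEnforcementPoint:
    """Intercepts a request, forwards it to the PDP and enforces the answer."""

    def __init__(self, pdp):
        self._pdp = pdp

    def handle(self, request):
        decision = self._pdp.decide(request)
        # Enforcement step: only an explicit "permit" opens the resource.
        return decision == "permit"
```

The PEP never evaluates policies itself; it only enforces the answer, which keeps the policy logic in a single place.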
If we unify both ideas and adapt them to our architecture, we can find a way to connect different industrial domains together with their application sub-domains, in which different protocols and technologies can coexist. To do this, we assume the following structural conditions, technologies and stakeholders:
Structural conditions: Today the new industrial revolution accepts the inclusion of new IT to manage, manipulate and store operational data and processes. This also means that industrial networks have to protect IT-OT connections through perimeter protection elements such as industrial firewalls and/or Virtual LAN (VLAN) for segmentation, Intrusion Detection/Prevention Systems (IDS/IPS) and Virtual Private Networks (VPN) for secure tunneling through IPSec.
Technologies: Apart from the technological diversity in control terms (e.g. sensors, actuators, controllers such as remote terminal units or programmable logic controllers, robot units, etc.) and the proliferation of industrial communication protocols (e.g. OPC-UA, 6LoWPAN, IO-Link, EtherNet/IP and EtherCAT, WirelessHART, ISA100.11a or ZigBee PRO) [6,16], there is an important need to integrate IT services to handle large industrial data streams and processes. Among these IT services, we highlight cloud and fog computing [34], which can compute contextual information for future administrative or operational actions, and benefit the control (per domain) and the processes related to context management, predictive maintenance, detection of anomalies and equipment failures, performance monitoring, governance, auditing or forensics.
As for security, it is widely assumed that all sections of the interconnection system, including the Machine-to-Machine (M2M) communication between devices, are protected through the existing security mechanisms and standards [33]. Beyond the perimeter protection, cryptography, key management systems, identity management, access control and traditional security protocols such as Secure Sockets Layer (SSL) or Transport Layer Security (TLS) are also essential for processing, storing and transferring critical data from a secure perspective [24]; without ruling out high-level security services such as privacy, trust or quality of service [12].
Stakeholders: As stated in [16], customers and providers may also take part in the operational procedures to accelerate, customize and optimize the manufacturing and logistics processes, improving operational performance and costs in the plant/field. This also means that the proposed model should allow for external connections with access to IT networks, such as the cloud or the fog. From the set of entities specified in [35], we also identify, among others, the participation of engineers, auditors and security administrators, since they can interact with the system to carry out actions that are essential for the production and distribution of minimal services to end-users, such as energy, water or food.
All these assumptions are also illustrated in Figure 1. This figure represents the technological confluence of the new Industry 4.0, composed of diverse operational and control areas and multiple types of stakeholders. As can be observed, each domain comprises a set of OT devices working with different communication protocols and interacting with IT networks, such as industrial wireless sensor networks, RFID (Radio-Frequency Identification) or fog computing. The role of fog computing is to locally provide a means of processing and storing large volumes of data, the information of which can also be compiled by a federated cloud infrastructure common to all the application domains. The cloud technology, by contrast, serves as a holistic environment capable of managing data related to users, control and context belonging to the different "smart world" scenarios (e.g. smart factories, smart grid, smart cities, smart health-care, etc.), the services of which are fundamental for social and economic well-being.
To articulate all these connections, the architecture accommodates two classes of PDP: one global to the entire system and another local to each application domain. The global PDP is placed in the cloud to (i) receive context information from each local PDP deployed in the fog and (ii) offer an overview of the state of the entire system and its correct performance. The PDP in the cloud is denoted here as PDP-cloud and the PDP in the fog is called MPDP-fog in reference to the MPDP described in [25]. Access to each of these two kinds of policy decision points depends on the type of entity (human operators, providers, customers, administrators, auditors, engineers, processes or IT-OT devices). Local entities linked to local operational actions in the field or in the process plant should gain access through their corresponding MPDP-fog, whilst remote entities (administrators, engineers, operators located at SCADA (Supervisory Control And Data Acquisition) centers, providers, auditors, etc.) should access the different local domains through the PDP-cloud. This functional characteristic is illustrated in Figure 2, which shows an example of how remote stakeholders are able to gain access to the PDP-cloud through PEP instances. However, the secure interoperability between IT-OT networks, the devices of which generally present performance limitations [10,7], adds the need to locally delegate the whole authorization process and the translation of security policies and communication protocols to the MPDP-fog nodes. This condition implies that the PDP-cloud only authenticates external entities and validates the access according to the context, leaving all access responsibility to the meta PDP. In this way, the architecture simplifies the centralized actions in the cloud and avoids bottlenecks. Note that this restriction also applies to the M2M communications of each domain.
In this case, the authentication procedure is concentrated in each MPDP to locally handle PEP calls between domains and unburden the cloud of these operations. Figure 3 represents the architectural design of the modules that integrate the PDP operations required between entities and domains. In particular, the architecture adds two chief components: the PDP manager and the context awareness manager. The former is in charge of validating the authentication tokens provided by each entity. This means that each entity must authenticate itself within its own organization, delegating the whole authorization process to the policy decision points.
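The routing rule of this section (local entities go through their MPDP-fog, remote entities enter through the PDP-cloud and are then handed over to the MPDP-fog of the target domain) can be summarized as follows. This is a sketch under our own assumptions: the function name, the "local"/"remote" labels and the string identifiers of the decision points are hypothetical.

```python
def decision_path(entity_location, target_domain):
    """Return the ordered list of decision points a request traverses.

    entity_location: "local" for entities acting inside the target domain,
                     "remote" for external stakeholders (assumed labels).
    target_domain:   name of the application domain that hosts the resource.
    """
    fog = f"MPDP-fog[{target_domain}]"
    if entity_location == "local":
        # Local operational actions are handled entirely in the fog.
        return [fog]
    if entity_location == "remote":
        # The cloud checks the entity and the context, and the local
        # MPDP-fog remains responsible for the authorization itself.
        return ["PDP-cloud", fog]
    raise ValueError(f"unknown entity location: {entity_location!r}")
```

For example, `decision_path("remote", "substation-a")` yields the two-hop path through the cloud, whereas a field operator in the same domain only touches the fog node.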

PDP-cloud: modules and functionality
Authentication is the procedure required to validate the identity of an entity and grant legitimate access to the resources of the system. If authentication is performed at the entity's premises and access is requested through the cloud, then the solutions described in [13] should be considered. This survey classifies the methods according to the location of the authentication modules, where the methods implemented on the "entity side" are mainly based on identity and context schemes. Chow et al., for example, define in [52] an identity-based authentication scheme, the core of which is focused on zero-knowledge authentication, digital signatures and the fuzzy method. In contrast, Schwab and Yang specify in [18] a federated authentication framework, known as TrustCube, with features similar to the OpenID technology, managing multiple types of policies related to the platform, devices and users. Authenticating on the entity side would not only reduce the maintenance costs of databases on the cloud side, but would also benefit user mobility. Human operators, engineers or even customers using mobile devices within a specific application scenario, such as manufacturing plants in smart factories or smart grid substations, can request PEP instances from anywhere, at any time and by any means, thereby promoting the new paradigms of the IoT; i.e. the IIoT.
Despite this local procedure, any identity validated at its own premises also has to prove its authenticity and legitimacy to the PDP-cloud through the use of authentication tokens. These tokens should carry certain information about the previous authentication process and specific information about the PEP request, such as the identity of the resource and the domain, and the type of action to be performed on the resource. All this information is compiled by the PDP manager together with additional information related to the roles and permissions assigned to the entity, the criticality level of the context in which the resource is deployed and the risks associated with that context. The context information is obtained through the context awareness manager, which is responsible for computing the level of observation and controllability received from the application domain itself. This information is generally associated with attribute values that explain, among other things, which sensors, actuators or controllers are isolated, how many sub-areas are segregated, which nodes are working and which are not, and the status of communication links, operating systems or network parameters.
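To make the content of such a token more concrete, the following sketch shows one possible layout and how the PDP manager could merge it with role and context data. All field names, the dictionary-based stores and the "unknown" default are assumptions made for illustration; the chapter does not prescribe a token format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthenticationToken:
    entity_id: str           # identity validated at the entity's premises
    issuer: str              # organization that performed the authentication
    authenticated_at: float  # time of the previous authentication (epoch seconds)
    resource_id: str         # identity of the requested resource
    domain: str              # application domain hosting the resource
    action: str              # type of action to perform on the resource


def compile_decision_input(token, roles_db, context_db):
    """Merge the token with roles and context, as the PDP manager would.

    roles_db:   hypothetical mapping entity_id -> set of role names.
    context_db: hypothetical mapping domain -> {"criticality": ..., "risk": ...},
                standing in for the output of the context awareness manager.
    """
    context = context_db.get(token.domain, {})
    return {
        "entity": token.entity_id,
        "resource": token.resource_id,
        "domain": token.domain,
        "action": token.action,
        "roles": sorted(roles_db.get(token.entity_id, ())),
        "criticality": context.get("criticality", "unknown"),
        "risk": context.get("risk", "unknown"),
    }
```

Marking missing context as "unknown" rather than guessing a value lets a later decision step treat an unobservable domain conservatively.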
Apart from the token authentication module, the PDP manager is composed of two further components: the access token manager and the access prioritization manager. These two components are based on the Role-Based Access Control (RBAC) strategy, as recommended by the standard IEC 62351-8 [35]. Concretely, the standard defines seven specific roles for engineering and control scenarios with different types of rights, such as the human operator, with the capacity for viewing, reading, reporting and controlling operational objects and processes, or the engineer, with the ability to view, read, report, configure and manage objects, databases and file systems. In addition to these roles, the standard reserves up to 32,767 roles for private use, which makes it possible to accommodate the new Industry 4.0 stakeholders identified in Section 2. In our case, we could define capacities for viewing, reading and reporting operational objects for auditors and customers, adding configuration support for providers.
This way of orchestrating permissions, together with the dynamic capacity of RBAC for separation of duties, commonly known as Dynamic Separation of Duties (DSD), permits the system to redistribute security controls according to the security policies of each organization and the contextual conditions, adding versatility to the approach and dynamism to the protection process. To do this, the risk assessment manager, included as part of the context awareness manager, has to compile all the information from the domains and check for the existence or persistence of possible risks [49] in the requested domains, where control should prevail in extreme situations. This means that each entity should support at least two roles, one working as primary and the other as secondary; in this way, when control areas lack sufficient connectivity, only authorized entities with specific roles can gain access to the affected area and take control of it. This property of DSD is widely described and implemented in [11,10].
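The primary/secondary role idea can be illustrated with a short sketch. It is not the DSD implementation of [11,10]; the function name, the notion of an `emergency_roles` set and the rule of trying the primary role first are assumptions chosen only to show how a session role could be selected once an area is degraded.

```python
def select_session_role(primary, secondary, area_degraded, emergency_roles):
    """Pick the single role an entity may activate for one session.

    primary, secondary: the two roles held by the entity.
    area_degraded:      True when the target area lacks sufficient connectivity
                        (assumed to be reported by the risk assessment manager).
    emergency_roles:    hypothetical set of roles authorized to take control
                        of a degraded area.
    Returns the activated role, or None when access must be refused.
    """
    if not area_degraded:
        # Normal conditions: the entity works under its primary role.
        return primary
    # Degraded conditions: only one authorized role may be activated.
    for role in (primary, secondary):
        if role in emergency_roles:
            return role
    return None
```

Only one role is ever returned, which reflects the separation-of-duties constraint that an entity does not exercise both roles in the same session.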
The context can also be managed by the early warning manager to estimate, in a timely manner and from a local or global perspective, the real state of the system for the next stage; and, in the worst case, to prepare and activate the protection mechanisms related to locating and alerting human operators, as well as to establish the prioritization levels taking into account the DSD properties. Any estimation must be stored in the database for future risk assessments, in which a set of parameters should be evaluated, such as the frequency, relevance and severity of the threat in the different domains, the criticality of the scenarios and their resources, the degree of devastation and the consequences (e.g. in social or economic terms). Computing all of these inputs makes it possible to estimate any cascading effect between subsystems or systems, to track and visualize the threat in real time in order to tackle the problem, and to improve the regulatory procedures related to governance, auditing and forensics.
All this context information is part of the Policy Information Point (PIP) as specified in RFC 2904 [60] for the interconnection of systems. A PIP refers to the management point where a set of attribute values related to resources, subjects and environment is compiled and normalized, in order to later determine the severity degree of the area and permit or deny access to it. This feature also allows us to adapt the Attribute-Based Access Control (ABAC) and Risk-Based Access Control (Risk-BAC) methods, and combine them with RBAC, in order to further restrict access conditions. Through ABAC+Risk-BAC, it is possible to take more stringent decisions based on the real attributes of the context and the risks associated with that context [57], further delimiting the access conditions by dynamically managing roles. In the literature, there are several related works for IoT and IIoT environments [38,29,14], which can be considered for future implementations.
Finally, the access manager, integrated in the PDP manager, not only processes the information received from the respective modules but also verifies the legitimacy of the permissions to be executed in the field. For this action, it is necessary to check the information against the security policies stored in the databases, which are managed by technical administrators, installers or engineers through Policy Administration Points (PAP). Once the information is processed, the manager generates an access token that is later used to validate the entity and the access itself in the destination domain. To accelerate the management of future related PEP instances or to detect possible abuses in the requests (e.g. replay attacks), the access manager also needs to keep a temporary copy of each managed instance in a cache memory.
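A cache of this kind can flag repeated requests by remembering, for a short window, which request from which address has already been handled. The sketch below is a minimal illustration and not the mechanism of any cited work: the class name, the 30-second window and the assumption that each request carries an identifier and a trustworthy (e.g. signed) timestamp are ours.

```python
class ReplayCache:
    """Temporary record of recently handled PEP requests."""

    def __init__(self, window=30.0):
        self.window = window  # seconds a request stays valid (assumed value)
        self._seen = {}       # (source_ip, request_id) -> request timestamp

    def accept(self, source_ip, request_id, timestamp, now):
        """Return True for a fresh request, False for a stale or repeated one."""
        # Forget entries whose timestamp has left the validity window.
        self._seen = {
            key: ts for key, ts in self._seen.items() if now - ts <= self.window
        }
        # A timestamp outside the window cannot be checked against the
        # cache any more, so the request is rejected outright.
        if abs(now - timestamp) > self.window:
            return False
        key = (source_ip, request_id)
        if key in self._seen:
            return False  # same request seen again: treated as a replay
        self._seen[key] = timestamp
        return True
```

Entries are kept until the timestamp itself expires, so a captured request cannot be replayed just after its cache entry has been evicted.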

MPDP-fog: modules and functionality
This section presents the architectural model of the meta policy decision points configured in the respective fog infrastructures installed in each of the application domains (see Figure 4). Similar to the PDP-cloud architecture, each MPDP-fog includes two chief modules: the access manager and the domain awareness manager. The first module contains an authentication component capable of addressing two types of actions depending on the origin and the class of token: (i) verifying the authenticity of the access tokens received from the cloud and (ii) validating the identity of those PEP instances established from other domains.
At this stage, the technical capabilities of the technologies are also key to determining the authentication mode. For example, the IIoT devices and manufacturing machines involved in M2M communication (e.g. sensors, actuators, controllers or robots) are generally not tamper-resistant and are based on constrained hardware components [4], operating on their own at remote locations such as substations or operational plants [40]. To reduce computational and communication overheads, the use of lightweight authentication schemes at the application layer and of security protocols at the transport layer (TLS or Datagram TLS (DTLS)) is extensively considered in the literature [28,1]. However, the design of lightweight solutions (at the application layer) for certain paradigms like the IIoT is still a great challenge for the scientific community [67]. In this regard, we highlight some works related to cyber-physical systems and the IIoT such as [26], [53] and [17]. Esfahani et al. propose in [26] a mutual authentication mechanism for M2M communication using simple primitives and mathematical operations (hashes and XOR), thereby simplifying the authentication processes. In [53], by contrast, the authors offer an authentication framework to validate the identity of each object in the IIoT according to device-specific information; and in [17], Chin et al. similarly propose a two-layer M2M authentication framework for smart grid scenarios where smart meters are authenticated by a public key infrastructure and digital signatures.
At the transport layer, several communication protocols are already available for the IoT [58,64], such as 6LoWPAN (IPv6 over Low-power Wireless Personal Area Networks) [47], MQTT (Message Queue Telemetry Transport) [44,58], AMQP (Advanced Message Queuing Protocol) [58], XMPP (Extensible Messaging and Presence Protocol) [61], DDS (Data Distribution Service) [45], and CoAP (Constrained Application Protocol) [48]; all of them supporting authentication measures through SSL/TLS and DTLS sessions. Specifically, all the protocols except CoAP are based on TLS, whilst CoAP relies on DTLS [26,1]. Moreover, XMPP and AMQP can also use the Simple Authentication and Security Layer (SASL) protocol to authenticate devices [42,43]. However, for all these protocols and the existing works related to the IoT field [37,31,63,62], it is advisable to verify the suitability of the approach taking into account the technical restrictions of the IIoT devices together with the control requirements specified in [7].
Continuing with the actions of the access manager, the system has to check all the previous states before computing any new access request. The goal is to reduce the computational cost involved in the context evaluation and in the translation of security policies and communication protocols. As stated in Section 2, operational performance is critical at this interconnection point, since multiple concurrent access requests are generally received at this stage, either from the cloud or from any application area (through a new PEP request). To ensure this performance level, the system needs to temporarily cache all the actions performed by the access manager to avoid passing through the translators of communication protocols and security policies. Normally, both modules demand computation and time to address translation tasks, considering the management and updating of specific tables for the matching of protocols (including ports and IP addresses) and policies. Nonetheless, this computational consumption is heavily dependent on the type of implementation designed for the translation engine. For example, the work [23] proposes a protocol translator for industrial communication based on a service-oriented architecture, translating on demand and at a low latency cost; whilst [36] translates the communication according to algebraic specifications, and [11,10] are based on a rule-based expert translation system.
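The benefit of caching previous actions can be illustrated with a thin wrapper around a translation engine. The sketch below is an assumption-laden example: `CachingTranslator`, the `engine` callable and the matching table are hypothetical and do not correspond to the translators of [23], [36] or [11,10]; the port numbers are the registered defaults for MQTT (1883) and CoAP (5683).

```python
class CachingTranslator:
    """Wraps a (possibly expensive) translation engine with a cache."""

    def __init__(self, engine):
        self._engine = engine  # callable: (protocol, port) -> (protocol, port)
        self._cache = {}       # previously translated requests
        self.engine_calls = 0  # how often the engine was actually invoked

    def translate(self, protocol, port):
        key = (protocol, port)
        if key not in self._cache:
            # Cache miss: pass through the translator once and remember it.
            self.engine_calls += 1
            self._cache[key] = self._engine(protocol, port)
        return self._cache[key]


# Hypothetical matching table used by a toy rule-based engine.
MATCHING_TABLE = {("mqtt", 1883): ("coap", 5683)}


def table_engine(protocol, port):
    try:
        return MATCHING_TABLE[(protocol, port)]
    except KeyError:
        raise LookupError(f"no translation rule for {protocol}:{port}") from None
```

Repeated requests of the same kind then cost a dictionary lookup instead of a pass through the engine, at the price of having to invalidate the cache whenever the matching tables are updated.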
In either case, these translations benefit interoperability in such a way that IIoT entities in general can connect with each other transparently, as stated in [11,10]. Both works pursue goals similar to those of the proposed approach, in which different interfaces can establish connectivity without having to follow an equivalent security policy criterion for all parties, and taking into account the natural conditions of the context to activate the DSD mechanisms if necessary. Going beyond these two works, our meta PDP nodes are not only able to handle the access according to the RBAC+ABAC properties, but are also able to proactively determine the accessibility level according to the risks of the context. At this point, risk management is critical to locally determine the severity degree of a threat and assess the consequences in order to establish much more restrictive conditions per area, instead of only processing it in a centralized node as outlined in [11,10].
Therefore, all our policy decision points, both the PDP-cloud and the MPDP-fog, manage the access taking into account the capabilities provided by RBAC+ABAC+Risk-BAC [41]. In particular, access prioritization is subject to the restrictions given by the RBAC-based access prioritization manager as specified in Section 3.1. This manager activates the DSD mechanism according to the risk evaluation given by the domain awareness manager, which includes four components similar to those of the context manager of the PDP-cloud. The main difference between the awareness manager of the MPDP-fog and that of the PDP-cloud is that the domain awareness manager is mainly focused on locally computing the context in which the application scenario operates. The information processed by this module can be very diverse, with data belonging to the physical world (e.g. humidity, temperature, pressure, etc.) and/or the virtual world through software processes, software agents (e.g. through opinion dynamics [49,51]) or logs.
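A combined RBAC+ABAC+Risk-BAC decision can be sketched as three successive filters. The role permissions follow the rights mentioned in Section 3.1 for operators, engineers, auditors, customers and providers; everything else (the function name, the `isolated` attribute, the numeric risk scale and the 0.7 threshold) is an assumption introduced only for this example.

```python
# Rights per role, following the examples given in the text.
ROLE_PERMISSIONS = {
    "operator": {"view", "read", "report", "control"},
    "engineer": {"view", "read", "report", "configure", "manage"},
    "auditor": {"view", "read", "report"},
    "customer": {"view", "read", "report"},
    "provider": {"view", "read", "report", "configure"},
}

READ_ONLY_ACTIONS = {"view", "read", "report"}


def decide(role, action, attributes, risk, risk_threshold=0.7):
    """Return "permit" or "deny" for one request.

    attributes: hypothetical context attributes of the target resource,
                here only {"isolated": bool}.
    risk:       assumed risk level of the context in [0, 1].
    """
    # RBAC: the role must hold the requested right.
    if action not in ROLE_PERMISSIONS.get(role, set()):
        return "deny"
    # ABAC: an isolated resource cannot be reached at all.
    if attributes.get("isolated", False):
        return "deny"
    # Risk-BAC: under high risk only read-only actions remain available.
    if risk > risk_threshold and action not in READ_ONLY_ACTIONS:
        return "deny"
    return "permit"
```

Each filter can only narrow the access granted by the previous one, which mirrors the idea of using attributes and risks to further restrict what the roles allow.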

Suitability of the architecture for Industry 4.0 scenarios
Considering the control requirements specified in [7], this section analyses the suitability of the architecture proposed in Section 2 and its functions for future Industry 4.0 scenarios. In [7], five requirements for industrial control systems are identified: real-time performance, dependability, sustainability, survivability and safety-critical; and for each of these requirements, the impact on the different elements and services of the system (information, resources, control, minimal services) is assessed. To adapt these five control requirements to our architecture, the analysis primarily evaluates them taking into account the main needs of the new Industry 4.0 and the interconnection requirements defined in [9], such as rapid access, transparency in the connections, communication in real time, availability and reliability, also adding protection of devices and security in the multi-domain connections.
Real-time performance: One of the main goals of including policy decision points in edge computing infrastructures is precisely to decrease the number of connections to the different application domains. Entities connecting from the cloud first need to validate their access. If the access is not legitimate, the system denies entry to the field/plant, thereby reducing the number of connections in the domains and unnecessary overloads. This feature is also contemplated in each domain individually, where entities first have to authenticate locally at their own premises in order to later gain access to the resources of another domain, thereby protecting the access to constrained resources. On the other hand, the use of cache memories and different authorization mechanisms, in which access privileges are restricted according to roles, the contextual conditions of each domain and the risks associated with these domains, also avoids serious overheads that may hamper the operational and control processes and cause significant delays.
Dependability and survivability: The possibility of managing risks from a proactive and reactive perspective allows the system to detect anomalies and respond accordingly, ensuring the availability of resources at all times and the reliability of their services. Many of the anomalies come from malfunctions or unsuitable configurations of systems or networks, or from deficiencies in the coexistence of multiple systems [8], which may consequently bring about numerous security problems [15,50]. Moreover, this way of offering automatic fault detection also brings a significant reduction in maintenance costs and benefits the future Industry 4.0 services allocated in the cloud, such as predictive maintenance and the optimization of operational services and equipment. In this case, our risk assessment and early warning managers should connect with external services to exchange information about any suspected threat, risk or anomaly, or could even connect with specialized cyber-security centers (e.g. computer emergency response teams such as the CCN-CERT [20] or the ICS-CERT [22]) to warn of extreme situations. Also related to cyber-defense, the use of cache memories helps to detect replay attacks by simply tracking the last PEP requests, IP addresses and timestamps as specified in [5]. And although advanced security services such as privacy and trust are not extensively considered in this chapter, they are also essential as part of the M2M communications, particularly between cloud/fog and IIoT devices [55,46].

Sustainability: The ability of the system to manage risks and supply accountability capacities (see Figures 3 and 4) allows it to provide more reliable governance, auditing and forensic services. The records at each of the entry points of the system can determine the type of access to the field/plant, the actions carried out on the resources, the entities or organizations responsible for these actions, the access periods and abuses in the connections. These inputs can even feed the risk assessment and early warning managers to estimate inappropriate actions, anomalies or threats, and this can also help the system to review its security policies and any regulatory framework required to respond accordingly. Evidently, if this process is rigorously followed, the system can comply with the interconnection requirements at all times and be sustainable for the control; i.e. maintain control services at all times and for a long period, at an acceptable level for the protection of resources and critical infrastructures [7]. This sustainability feature is also supported by the ability of each MPDP to translate protocols and security policies and, if the corresponding modules are also regularly updated, the system ensures a tenable interconnection.
Safety-critical: In this respect, we highlight the capacity of the system to protect the critical resources from external accesses, especially when the domain hosting the resources is in an extreme crisis situation. Under these critical circumstances, it is always advisable to recover and return the control [3,2] to the affected area, and to avoid, as much as possible, spreading the effect of the threat to the rest of the interconnected domains, known as a cascading effect. In addition, the management of proactive responses also aims to reduce possible secondary effects in the system or between systems, reducing the risk levels in advance [56] and any threatening effect that may entail a drastic cascading effect.
Taking into account all these control and interconnection principles, we consider that our architecture is suitable for the new control industry, in which a set of (IT-OT) technologies, protocols and networks have to coexist for a long period of time. From these technologies, we particularly focus on the cloud and fog infrastructures to accommodate the approach and reduce computational and communication costs, as well as enhance their resources to add additional capacities related to interconnection and protection in different terms and levels; all of them necessary for the new Industry 4.0 scenarios.

Conclusions and future work
A multi-domain interconnection architecture is proposed in this book chapter to connect multiple federated areas belonging to critical infrastructures (e.g. manufacturing industry and supply chains, food production plants, power grids and smart cities [39], and water treatment plants) without breaking the control requirements that these infrastructures generally demand. Typical domains are, for example, the generation, transmission and distribution substations configured as part of a smart grid, or the different manufacturing sections corresponding to smart factories or supply chains. To do this, the architecture is based on a two-layer interconnection system composed of two kinds of policy decision points: one located at a centralized system and another distributed throughout the different application domains. The centralized node corresponds to a cloud server capable of managing any entry belonging to external entities of the system or subsystems, such as customers, providers, auditors, etc.; whilst the distributed PDP are in charge of controlling any access coming from other domains or from the cloud.
This two-layer architecture incorporates in each PDP a set of functional modules with the ability to handle the access according to the characteristics and intentions of each entity together with their roles, the real state of the context and its resources, as well as the risks associated with this context. Therefore, the approach includes components capable of orchestrating aspects related to RBAC+ABAC+Risk-BAC, with support for proactive solutions before serious interruptions arise within the entire system. In the future, we intend to implement all these components in our laboratory [59] and later include them as part of the goals of the European SealGRID project [27], thereby demonstrating all the functionalities of the architecture for the new control industry and further considering the incorporation of specific services related to the protection of communication channels (entities-cloud/fog, cloud-fog, M2M), privacy and trust.

Figure 4: MPDP-fog: Architecture and functional components