Reliable Optical Networks With ODTN: Resiliency and Fail-Over in Data and Control Planes

This article reports a technical demonstration showing the use of the ONOS SDN controller for disaggregated transport networks, summarizing the latest developments and results within the Open Networking Foundation (ONF) Open Disaggregated Transport Networks (ODTN) project. The demonstration mainly covers the dynamic provisioning of data connectivity services and advanced automatic failure recovery, both at the control and data plane levels. For the provisioning, we demonstrate the usage of open and standard protocols and interfaces. For the recovery part, we first demonstrate, covering the control plane, how ONOS can behave as a logically centralized controller while using multiple coordinated instances for robustness. We show how the different devices remain under control even in the event of a failure of one of such instances, relying on the ATOMIX framework and the dynamic real-time negotiation of device mastership. For the data plane, we demonstrate the capabilities of the controller to perform automatic restoration of optical services.

commercial products but experimental hardware developed for research purposes. Moreover, [5]-[7] do not consider failure scenarios, and the work in [4] only considers, on the data plane, the node failure scenario.
In the demonstration reported in this paper ONOS first discovers the equipment through the NETCONF [9] protocol and the corresponding device YANG [10] models. Then, ONOS provisions a bidirectional lightpath, after receiving a request via its Northbound API.
Once the lightpath is established, the first goal of the demo is to demonstrate the robustness of the ONOS controller. Indeed, the typical ONOS deployment consists of a number of instances; during normal behaviour the instances collaborate operating in a continuously synchronized manner sharing the different operations and devices of the network thus increasing the system scalability. In case of failure of one instance the remaining instances take over the control of the underlying infrastructure. This is one of the main differentiating factors of the ONOS controller framework with respect to other SDN controllers. ONOS behaves as a logically centralized controller (thus enabling a seamless implementation of an application ecosystem, as expected), while internally managing a distributed and synchronized set of instances.
The second goal of the demonstration is to show the operations automatically performed to react to data-plane failures. In particular, the demonstration first simulates a fiber cut through a port-down command. The ONOS controller then automatically re-provisions the path across the network to account for the failure. If a protection path was not previously set up, the re-provisioning takes 500 ms for ONOS to compute the new configuration and 900 ms for the devices to tune the lasers on the new paths, so a total traffic disruption of 1.4 seconds is expected for a non-protected path. The demo then simulates a device malfunction by erasing the existing ROADM media channel configuration directly from the network element. In this case, the ONOS controller detects the misalignment and automatically re-configures the device with its correct configuration. Finally, the demo simulates a node failure (i.e., the connection between the controller and a device is interrupted); in this case, lightpaths passing through the failed device are re-routed, while the controller continuously tries to re-establish connectivity with the lost device.
The paper is organized as follows. First, Section II provides an update on recent developments of the ODTN working group. Then, Section III details the data plane deployment used in the demonstration. Section IV presents the SDN-based control plane architecture, highlighting the decomposition into logical instances. The procedures adopted by the SDN Controller to dynamically obtain the network topology and to provision connection services are described in Sections V and VI, respectively. Then, Sections VII and VIII report the control plane and data plane recovery procedures, respectively, providing a detailed description of the steps adopted by the controller. Finally, Section IX draws the conclusion.

II. RECENT ODTN ACTIVITIES
The ODTN working group was created at the ONF in 2018 to extend ONOS to support and monitor disaggregated optical networks. A detailed report of the ODTN activities up to summer 2019 can be found in [4]. This section summarizes the most recent ODTN activity. In particular, focus is placed on handling failures and achieving resiliency as described in this paper. New features were also developed; among others, we integrated Bit Error Rate (BER) retrieval capabilities from OpenConfig transponders. Pre- and post-Forward Error Correction (FEC) BER is collected and then exposed through REST APIs, the Command Line Interface (CLI) and the User Interface (UI). Integrating this parameter allows a more in-depth look at the optical link state, easing diagnostics and analysis. BER is no available over every OpenConfig transponder.
The ODTN project also dedicates effort to expanding the pool of optical equipment that it can control and manage. For OpenConfig-based transponders, integration and testing were done with Fujitsu 1FINITY T100 [26] and Infinera Groove G30s [25]; for ROADMs, drivers were included in ONOS to control CzechLight equipment [23], as shown at the TIP Summit 2019. To allow a full end-to-end open source stack on top of white box hardware for packet-optical transponders, the Stratum [24] operating system from ONF has been extended to support optical configuration through OpenConfig and gNMI. This integration also included drivers in ONOS. Thanks to this effort, all of the capabilities discussed in this paper are now offered with Stratum over the Edgecore Cassini packet-optical transponder.
ODTN has also been extended to be capable of leveraging the GNPy optical simulation and planning tool [22] for path computation across the optical domain. Through the use of GNPy ODTN is now aware of different optical impairments, such as fiber loss or device gain capabilities. The path is selected among the possible ones by using the GSNR value: the path with highest value, thus lowest signal interference, is chosen and configured in the network.
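The selection rule described above can be sketched as follows. This is a minimal illustration and not GNPy or ONOS code: the candidate paths and their GSNR values are invented for the example, and `select_path` is a hypothetical helper.

```python
from typing import Dict, Tuple

# A path is represented as a tuple of node names (hypothetical names).
Path = Tuple[str, ...]


def select_path(gsnr_db: Dict[Path, float]) -> Path:
    """Return the candidate path with the highest GSNR (in dB)."""
    if not gsnr_db:
        raise ValueError("no candidate paths")
    return max(gsnr_db, key=gsnr_db.get)


# Illustrative GSNR values only; real values would come from the planning tool.
candidates = {
    ("TX1", "ROADM1", "ROADM2", "TX2"): 21.4,
    ("TX1", "ROADM1", "ROADM3", "ROADM2", "TX2"): 18.9,
}
best = select_path(candidates)
```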

III. DATA PLANE DEPLOYMENT
The network used in the demonstration comprises two CASSINI optical transponders and two Reconfigurable Optical Add/Drop Multiplexers (ROADMs). The transponders are modelled as OpenConfig terminal devices [12] and controlled by specific drivers within ONOS that adhere to the OpenConfig YANG model definition [21]. The two transponders are connected to the ROADMs, which in turn are connected through redundant links to support data plane fail-over. As depicted in Fig. 1, the transponders are two Edgecore AS7716-24SC white box devices equipped with Lumentum CFP2-ACO Coherent Optical Transceivers on the optical side. The transponders, through OCNos from IPInfusion, expose a NETCONF API that models their capabilities and data through the OpenConfig YANG models. The Lumentum ACO card is integrated through a driver with the Transponder Abstraction Interface (TAI) [13], which exposes a high-level set of APIs to configure transceiver capabilities. The transponders' client ports are attached to emulated end-hosts to generate traffic and measure network state.
The ROADMs are Lumentum ROADM-20 white boxes that expose a NETCONF API described by a non-standard YANG model provided by the device vendor [6]. A specific ONOS driver that maps high-level application commands to operations on the device has therefore been implemented and is now part of the official ONOS distribution.

IV. CONTROL PLANE ARCHITECTURE
The ONOS SDN Controller is deployed in a scenario with 3 instances, as illustrated in Fig. 2. Every instance runs identical Java code in a Java Virtual Machine (JVM) inside its own Docker container. A group of ONOS instances is known as an ONOS cluster. The cluster shares state through a 3-instance ATOMIX partition set [14]. This ATOMIX cluster is likewise deployed in a 3-instance scenario, with 3 other Docker containers each running a JVM. Both ONOS and ATOMIX are deployed with an odd number of instances to avoid the split-brain problem of distributed systems [15].
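The preference for an odd number of instances follows from majority-quorum arithmetic, which can be sketched as below. This is a generic illustration of majority voting, not code taken from ONOS or ATOMIX; the helper names are hypothetical.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Members that can fail while a majority is still reachable."""
    return cluster_size - quorum(cluster_size)


for size in (3, 4, 5):
    print(size, quorum(size), tolerated_failures(size))
```

Under this rule a 4-member cluster tolerates no more failures than a 3-member one, and two halves of an even-sized cluster can each fail to reach a majority, which is why odd sizes are generally preferred.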
The described approach results in 6 Docker containers for the total deployment. For this demonstration, all the Docker containers run on a bare metal server with a 10-core x86 CPU and 64 GB of memory. The server is connected to the network devices through a separate management network. In real deployments, a number of servers or a Kubernetes cloud native environment can be used to host each required Docker container to avoid single points of failure.

A. Atomix Distributed in Memory Database
To withstand control plane failures without loss of data, ONOS leverages Atomix, a distributed, persistent, in-memory database based on the RAFT [16] consensus protocol. Atomix is a fully featured framework for building fault-tolerant distributed systems. Combining ZooKeeper's consistency with Hazelcast's usability and performance, Atomix uses a set of custom communication APIs, a shared Raft cluster, and a multi-primary protocol to provide a series of high-level primitives for building distributed systems and to solve many common distributed systems problems, including cluster management, asynchronous messaging, group membership, leader election, distributed concurrency control, partitioning, and replication. Atomix abstracts the distributed aspect by exposing a series of Java APIs very akin to the Collection framework, such as EventuallyConsistentMap, ConsistentMap, AtomicCounter and others. Such easy-to-use APIs are at the foundation of each subsystem in ONOS.

B. ONOS Distributed Stores
All information inside ONOS is saved in distributed stores. Depending on the requirements of a service, the algorithm used to store and distribute data between nodes can have different characteristics (e.g., strongly consistent, eventually consistent, etc.). This is made possible by Atomix exposing Java Collections having such characteristics, thus allowing each service's store to implement the appropriate distribution mechanism according to its data. Historically, the store for Mastership management used Hazelcast's distributed structures as a strongly consistent backend. Since ONOS 1.4 the Atomix framework is used instead. The stores for network state such as Devices, Links, and Hosts use an optimistic replication technique complemented by a background gossip protocol to ensure eventual consistency. Simply put, the same subsystems of two different nodes synchronize directly with one another through the Store. The Store only synchronizes the states of the subsystem that it is part of. For example, a DeviceStore only knows about the state of devices, and does not have any knowledge of how host or link information is tracked. A different approach is instead taken with Flow Rules and configuration. Since these are edicts that must be enforced on the network, a strongly consistent storage through Atomix is used, requiring the nodes to have knowledge of the state before any operation happens towards the devices, thus allowing controller nodes to fail at any time with no disruption.
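To make the idea of background reconciliation between eventually consistent replicas concrete, the following sketch merges two replicas by keeping, for each key, the entry with the newest logical timestamp. This is one simple last-writer-wins scheme chosen for illustration; it is an assumption and not a description of the actual ONOS store implementation, and all names are hypothetical.

```python
from typing import Dict, Tuple

# Each replica maps a key to (logical timestamp, value).
Replica = Dict[str, Tuple[int, str]]


def anti_entropy(a: Replica, b: Replica) -> None:
    """Exchange entries so both replicas hold the newest version of each key."""
    for key in set(a) | set(b):
        versions = [replica[key] for replica in (a, b) if key in replica]
        newest = max(versions, key=lambda entry: entry[0])
        a[key] = newest
        b[key] = newest


# Two device-store replicas that diverged after independent local updates.
node_1 = {"roadm-1": (3, "AVAILABLE"), "transponder-1": (1, "AVAILABLE")}
node_2 = {"roadm-1": (2, "OFFLINE"), "transponder-2": (5, "AVAILABLE")}
anti_entropy(node_1, node_2)
```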
When ONOS saves any information in the stores, the underlying Atomix receives it and shares it among its partitions, over a single TCP connection on port 9876, ensuring consistency. In the case targeted in our demonstration, devices, ports, links, intents and flows each have their corresponding distributed stores. Thus, if a failure affecting an ONOS instance occurs, the information is not lost, since it is shared through Atomix and can still be read by the remaining ONOS instances. When a new instance re-connects to the ONOS cluster, it can retrieve all data from the distributed stores in order to synchronize state.

V. NETWORK TOPOLOGY INITIALIZATION AND EQUIPMENT DISCOVERY
In our particular SDN deployment, the topology information, involving devices' ports and capabilities (transceivers, physical channels, identifiers, etc.) as well as links, is discovered through OpenConfig interfaces based on the NETCONF protocol.
ONOS receives from an external Operation Support System (OSS) or Business Support System (BSS) a JavaScript Object Notation (JSON) encoded request containing the endpoint information of the different devices and the drivers to use for such devices (the drivers define the protocol/management interface used to discover the devices). Fig. 3 shows an example JSON used to initialize the network topology view of the ONOS controller, including two devices and the connecting links.
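The general shape of such a request can be sketched as below. The layout is loosely modelled on the ONOS network configuration format, but every identifier, address, credential and driver name here is a placeholder chosen for illustration; Fig. 3 remains the authoritative example.

```python
import json


def device_entry(ip: str, port: int, driver: str) -> dict:
    """Build one device record: the driver to use plus NETCONF endpoint details."""
    return {
        "basic": {"driver": driver},  # placeholder driver name
        "netconf": {
            "ip": ip,
            "port": port,
            "username": "example-user",      # placeholder credentials
            "password": "example-password",
        },
    }


# Two hypothetical devices (documentation-range addresses) and one link between them.
netcfg = {
    "devices": {
        "netconf:192.0.2.11:830": device_entry("192.0.2.11", 830, "example-transponder-driver"),
        "netconf:192.0.2.21:830": device_entry("192.0.2.21", 830, "example-roadm-driver"),
    },
    "links": {
        "netconf:192.0.2.11:830/1-netconf:192.0.2.21:830/1": {
            "basic": {"type": "OPTICAL", "durable": True}
        }
    },
}
payload = json.dumps(netcfg, indent=2)
```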
Upon receiving the device endpoint information, ONOS establishes a connection and, after a successful discovery (typically involving exchanges with the devices to retrieve topological information), topological elements are stored in different ONOS stores:
• The ONOS topology store, holding the topology graph consisting of edges and vertices.
• The ONOS device store, holding device and port information; the available media channels (optical wavelengths) are also stored here.
• The ONOS Dynamic Configuration Store (DCS), which can later be queried using a variety of north-bound protocols, e.g., TAPI.
The DCS storage is structured following pre-configured data models, in particular the ONF Transport API (TAPI) [17] data models, so TAPI data nodes such as Links, Nodes, Node-Edge-Points and Service Interface Points (SIPs) are exposed through a RESTCONF [18] interface to applications and high-level API consumers such as network orchestrators or operators' OSS and BSS. Besides retrieving SIPs, this interface may also be used to learn about the network topology and to issue connectivity service requests, which trigger the establishment of data plane services (ITU-T fixed-grid Optical Channels and/or flexi-grid network media channels).

A. Detailed Provisioning
Consider the provisioning of a service involving two transponders and two Lumentum ROADMs. Initially, the devices are provisioned in the ONOS SDN controller by using a dedicated REST interface (i.e., the interface to be used by external OSS/BSS systems). The devices are thus declared and ready to be controlled by the SDN Controller. The first part involves the creation of the NETCONF session. After the session is established, the SDN Controller can retrieve the details of the devices, notably device attributes and data nodes like the vendor, serial number, hardware and software vendors, etc. At this stage the devices are added to the DCS as mentioned above.
After the initial discovery, OpenConfig and Lumentum ONOS drivers allow the discovery of ports, and port types, which are later modelled in ONOS internal topology model and can be retrieved via a TAPI interface.
Parameters of the device are next retrieved, like port types, tunability restrictions, operational state, etc. The following example shows retrieval of a ROADM port.
This step concludes the equipment discovery phase (including 5 devices) for which the ONOS controller takes about 1.7 seconds.

VI. OPTICAL CONNECTIVITY ESTABLISHMENT
The demo simulates an OSS/BSS that issues a connectivity request through TAPI to ONOS, to obtain end-to-end connectivity between two client-side ports of the transponders. ONOS processes the request, translates it into intents, stores them in the distributed intent subsystem store and performs path computation and resource allocation (i.e., wavelength assignment), resulting in specific configurations of transponders and ROADMs.
Such configurations are stored in the ONOS system in the form of flow rules. For example, for the transceivers optical ports these rules convey the DWDM grid, channel's central frequency and port number, as well as information required to configure a cross-connection between the client side and line side ports. Each device has an ONOS driver that maps flow rules into actual device configuration based on the underlying device data model (i.e., edit-config NETCONF messages are generated with the proper XML code). In particular, for the transponders, OpenConfig constructs are created and sent down to the device through the pre-established NETCONF connection.
As defined by the OpenConfig model, the line-side to client-side cross-connection is installed through a logical channel association, while an optical channel construct with frequency and power for the specific transceiver is used to configure the Lumentum ACO card. Within the ONOS intent framework, the configuration of the client side and of the line side are respectively mapped into an OpticalCircuit intent and an OpticalConnectivity intent [4].
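As a rough sketch of what an edit-config message carrying an OpenConfig optical channel construct could look like, the following builds the XML with the Python standard library. The component name, message-id and the choice of the running datastore are assumptions made for illustration; the messages actually generated by the ONOS drivers may differ in structure and content.

```python
import xml.etree.ElementTree as ET

NC = "urn:ietf:params:xml:ns:netconf:base:1.0"
OC_PLATFORM = "http://openconfig.net/yang/platform"
OC_TERMINAL = "http://openconfig.net/yang/terminal-device"


def optical_channel_edit_config(component: str, frequency_mhz: int, power_dbm: float) -> str:
    """Build an edit-config RPC that sets frequency and output power on one component."""
    rpc = ET.Element(f"{{{NC}}}rpc", {"message-id": "1"})  # message-id is arbitrary
    edit = ET.SubElement(rpc, f"{{{NC}}}edit-config")
    target = ET.SubElement(edit, f"{{{NC}}}target")
    ET.SubElement(target, f"{{{NC}}}running")  # assumption: device accepts writes to running
    config = ET.SubElement(edit, f"{{{NC}}}config")
    components = ET.SubElement(config, f"{{{OC_PLATFORM}}}components")
    comp = ET.SubElement(components, f"{{{OC_PLATFORM}}}component")
    ET.SubElement(comp, f"{{{OC_PLATFORM}}}name").text = component
    channel = ET.SubElement(comp, f"{{{OC_TERMINAL}}}optical-channel")
    cfg = ET.SubElement(channel, f"{{{OC_TERMINAL}}}config")
    # OpenConfig expresses the optical channel frequency in MHz.
    ET.SubElement(cfg, f"{{{OC_TERMINAL}}}frequency").text = str(frequency_mhz)
    ET.SubElement(cfg, f"{{{OC_TERMINAL}}}target-output-power").text = f"{power_dbm:.2f}"
    return ET.tostring(rpc, encoding="unicode")


# Hypothetical component name; 191.35 THz expressed in MHz.
message = optical_channel_edit_config("example-och-1", 191_350_000, -6.0)
```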
After the configuration of both transponders and both ROADMs we show that the two hosts connected to the client-side

A. Detailed Discovery
At the provisioning stage, an OpticalConnectivityIntent has been pushed between the two CASSINI transponders passing through the Lumentum ROADMs. This is mapped into Flow Rules that use the Lumentum API to provision the cross-connect, as well as into the configuration of the OpenConfig devices. In particular, note the selected OchSignal −35 × 50 GHz, which corresponds to a central frequency of 191.35 THz.
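The mapping from the OchSignal multiplier to the central frequency follows the ITU-T G.694.1 fixed-grid convention, in which the central frequency is 193.1 THz plus the multiplier times the channel spacing. A small check of the value quoted above, with hypothetical helper names:

```python
from fractions import Fraction

# ITU-T G.694.1 anchor frequency in THz, kept exact to avoid rounding error.
ANCHOR_THZ = Fraction("193.1")


def central_frequency_thz(multiplier: int, spacing_ghz: int) -> Fraction:
    """Central frequency of a fixed-grid channel: anchor + multiplier * spacing."""
    return ANCHOR_THZ + Fraction(multiplier * spacing_ghz, 1000)


freq = central_frequency_thz(-35, 50)
print(float(freq))  # 191.35
```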
For this step, i.e., forwarding of computed configurations to the devices the ONOS controller takes about 1.2 seconds.
Thus, adding the devices to the ONOS controller, allowing ONOS to discover the device details, requesting a service, configuring the devices based on the computed parameters and then removing the connection takes approximately 3 seconds. Fig. 4 shows a (small) part of the exchanged messages.
A more detailed analysis of the time contributions involved in the connection establishment process can be found in [4].
One can also remove the given configuration across the whole network. The following traces show the value of the configuration being set to 0 and then removed by ONOS on the CASSINI devices, and the same happening in the Lumentum ROADM.

B. Launch Power Configuration
During this demo we have also shown that ONOS is capable of configuring the launch power at each hop of the network, thus ensuring proper OSNR for the end-to-end path to be established. Currently the power calculation is done outside of ONOS and the result is manually inserted for each port in a vendor-independent way through the PowerConfig behaviour.
Such behaviour abstracts the underlying models. The following trace shows ONOS configuring a power of -6 dBm on the line port of the CASSINI through OpenConfig.

VII. CONTROL PLANE FAILURE AND RECOVERY
Different types of failures can happen at the control plane layer, e.g., a process within one of the ONOS instances is faulty or the supporting physical server has failed. This can lead to subsequent failures and undefined behaviour of the system. In this demonstration, we focus on ONOS deployed in a multi-instance scenario. In such a scenario ONOS is capable of handling instance failure by leveraging shared state and changing device mastership.

A. Device Mastership
In ONOS, each instance of the cluster has a role with respect to each of the devices; the roles are MASTER, STANDBY or NONE. Every device has a single MASTER instance at a given time, while all the remaining instances stay in STANDBY mode. The Master is elected for a device by the MastershipService through an Atomix-based election mechanism. Only the Master of a given device can act upon it by applying configuration.

B. Demo Scenario
Initially, as Fig. 5 shows, 3 Docker containers of ONOS control the network together. The nodes are known to each other; note that the * indicates the current node.
ONOS relies on Raft partitions to replicate, share and distribute information.
ONOS_1 and ONOS_3 control Transponder_1 and Transponder_2 respectively, with state synchronization between the instances. One ROADM is managed by ONOS_2 and the other by ONOS_3.
At step (1), ONOS_1 docker container is killed resulting in the other instances receiving the notification for that instance with INACTIVE state.
Transponder_1 immediately goes out of management because no NETCONF session is open to it. Mastership of Transponder_1 is then moved to the active ONOS_2 via negotiation at step (2). The new master first reads the state information of the device in both the traditional ONOS datastore and the TAPI data tree. Finally, at step (3), ONOS_2 establishes a new NETCONF channel with Transponder_1, which was controlled by the deactivated ONOS_1.
At this point ONOS_2 is in full control of Transponder_1 and can react to events and provision configuration on it. The timestamps in the logs show a time of 751 ms from the moment ONOS_1 is recognized as down to the moment the connection is re-established. This measurement is in line with the 700-800 ms observed in all the experiments we have done. The 751 ms are spent on mastership election, session re-establishment and store notification.
The whole recovery procedure enables a cluster of ONOS instances to maintain control of the network at any given time with no inconsistent state. The control plane recovery, as the data plane one, is completed automatically without operator intervention.
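The mastership hand-over of the demo scenario can be mimicked with the toy model below. The re-assignment policy used here (choose the surviving instance that masters the fewest devices) is an assumption made for illustration; in ONOS the new master is determined by the Atomix-based election, whose internals are not reproduced here. The ROADM names are placeholders.

```python
from typing import Dict, List


def fail_over(masters: Dict[str, str], nodes: List[str], failed: str) -> Dict[str, str]:
    """Return a new device-to-master map after the instance `failed` leaves the cluster."""
    survivors = [node for node in nodes if node != failed]
    if not survivors:
        raise RuntimeError("no surviving instance")
    result = dict(masters)
    for device, master in sorted(masters.items()):
        if master != failed:
            continue
        # Assumed policy: least-loaded survivor, ties broken by instance name.
        load = {n: sum(1 for m in result.values() if m == n) for n in survivors}
        result[device] = min(survivors, key=lambda n: (load[n], n))
    return result


initial = {
    "Transponder_1": "ONOS_1",
    "Transponder_2": "ONOS_3",
    "ROADM_A": "ONOS_2",
    "ROADM_B": "ONOS_3",
}
after = fail_over(initial, ["ONOS_1", "ONOS_2", "ONOS_3"], "ONOS_1")
```

With these inputs, Transponder_1 ends up mastered by ONOS_2, as in the demonstration, while the other devices keep their masters.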

C. SSH Reestablishment
ONOS ensures a continuous connection with the underlying network devices. In our demonstration, the NETCONF SSH session times out according to the device's own configuration. ONOS, upon detecting the loss of the SSH channel, immediately proceeds to re-open it from the master instance. This procedure is done automatically and ensures continuous connectivity to each device in the network, enabling immediate reaction to failures. The following trace shows ONOS re-establishing the connection with the different elements.
ONOS also provides an exponential back-off for the retry. Upon a certain number of failed retries ONOS marks the device as OFFLINE, notifying all the listeners and removing it from any future path computation.
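The retry behaviour can be sketched as follows. The retry limit, base delay and doubling factor are illustrative assumptions and not the values used by ONOS, and `reconnect` is a hypothetical helper; pass `time.sleep` as `sleep` to actually wait between attempts.

```python
from typing import Callable


def reconnect(connect: Callable[[], bool],
              max_retries: int = 5,
              base_delay_s: float = 1.0,
              sleep: Callable[[float], None] = lambda seconds: None) -> str:
    """Retry `connect`, doubling the wait after each failed attempt.

    Returns "ONLINE" on success, or "OFFLINE" once all retries have failed.
    """
    delay = base_delay_s
    for _ in range(max_retries):
        if connect():
            return "ONLINE"
        sleep(delay)
        delay *= 2  # exponential back-off
    return "OFFLINE"


# Example: a device that never answers, with the requested delays recorded.
delays = []
state = reconnect(lambda: False, max_retries=4, base_delay_s=0.5, sleep=delays.append)
```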

VIII. DATA PLANE FAILURE AND RECOVERY
The internal architecture of ONOS is layered, with multiple systems offering services to higher-level systems and applications. In particular, a flexible and advanced system in ONOS to provision services is based on Intents, i.e., high-level descriptions of data services [11]. The ODTN project application maps north-bound interface requests using the T-API protocol to optical intents, re-using the existing intent framework. More specifically, OpticalCircuit intents are used to model connectivity between two client transponder ports, and OpticalConnectivity intents are used to model connectivity between line ports of transponders passing through intermediate ROADMs [4]. By using optical intents, we leverage the automated failure recovery built into the ONOS intent framework, which underlies all the types of intents provided by the platform, optical included. Fig. 6 shows the finite state machine that ONOS uses for intent installation, compilation and failure detection [11]. For the purpose of this work, it is important to note that an Installed intent may move to the Recompiling state in case of an event affecting the topology (e.g., a fiber cut); recompilation may result in the computation of a different path along which the traffic may flow, so that the desired intent can be recovered. If a new path is present and compiled, it will be installed in the network as shown previously.
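The part of the intent life cycle relevant to recovery can be approximated with a small transition table. The state names follow those commonly used by the ONOS intent framework, but the table below is a simplified subset written for illustration; Fig. 6 and [11] remain the authoritative description.

```python
# Simplified, assumed subset of the intent state machine.
TRANSITIONS = {
    "INSTALL_REQ": {"COMPILING"},
    "COMPILING": {"INSTALLING", "FAILED"},
    "INSTALLING": {"INSTALLED", "FAILED"},
    "INSTALLED": {"RECOMPILING", "WITHDRAW_REQ"},
    "RECOMPILING": {"INSTALLING", "FAILED"},
    "FAILED": {"RECOMPILING", "WITHDRAW_REQ"},
    "WITHDRAW_REQ": {"WITHDRAWING"},
    "WITHDRAWING": {"WITHDRAWN"},
    "WITHDRAWN": set(),
}


def advance(state: str, new_state: str) -> str:
    """Move to `new_state` if the simplified table allows it."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state


# A fiber cut on an installed intent: recompile, then install the new path.
state = "INSTALLED"
for step in ("RECOMPILING", "INSTALLING", "INSTALLED"):
    state = advance(state, step)
```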
In the following, we detail the different aspects of the demonstration.

A. Loss of Configuration
With the traffic flowing, we simulate a loss of configuration in one of the Lumentum ROADMs by manually removing, through its Command Line Interface (CLI), the cross-connection installed by ONOS. This emulates a device reboot, a loss of configuration or a mis-configuration in which a given service is removed from the device. ONOS performs periodic polling and automated checking of the established configurations (flow rules). Thus, the loss of configuration is picked up by ONOS during its periodic state reconciliation. During the demo we set this period to 5 s, but it can be lowered further to reduce the detection time, at the cost of putting more strain on the network. After ONOS detects the absence of the expected configuration (flow) in the ROADM, it immediately sends a request to re-install the same configuration.
The provisioning follows the same procedure as in Section V-A. This operation can take at most the configured period of state reconciliation (e.g., 5 s), plus on the order of 500 ms for ONOS to re-compute the path as shown in the previous logs, plus the time needed by the device to save and re-apply the configuration. As expected, the recovery time is directly related to the polling frequency. Faster recovery is possible at the expense of an increased polling message rate.
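The reconciliation step and the resulting worst-case bound can be sketched as follows. The rule identifiers and helper names are hypothetical; the timing inputs are the 5 s polling period and the roughly 500 ms re-computation mentioned above, together with the 900 ms laser re-tuning time reported in this paper, used here as an example of the device-side contribution.

```python
from typing import Set


def missing_rules(expected: Set[str], installed: Set[str]) -> Set[str]:
    """Rules the controller expects on the device but did not find when polling."""
    return expected - installed


def worst_case_recovery_ms(poll_period_ms: int, recompute_ms: int, device_apply_ms: int) -> int:
    """Upper bound: the fault occurs right after a poll, then recompute and re-apply."""
    return poll_period_ms + recompute_ms + device_apply_ms


# Hypothetical rule identifiers: the ROADM lost one cross-connection.
expected = {"xc-port1-port2-191.35THz", "xc-port2-port1-191.35THz"}
installed = {"xc-port2-port1-191.35THz"}
to_reinstall = missing_rules(expected, installed)

bound_ms = worst_case_recovery_ms(5000, 500, 900)
print(bound_ms)  # 6400
```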
A further measurement is presented through a snippet of the ping command; this command prints a millisecond timestamp for each ping and allows us to corroborate the measurements obtained from the ONOS logs.
It can be noted that the total amount of time is 1.4 s: the deletion of the ROADM state was issued at 16:43:48.140, and ONOS took 400 ms to recognize the fault and then 100 ms to recompute and reinstall the configuration. The device took 900 ms to re-tune the laser after the configuration push from ONOS.

B. Fiber-Cut
With the traffic flowing we simulate a fiber cut. The fiber cut is simulated by switching one of the ROADM's ports to a down state via the device Command Line Interface (CLI), completely separate from ONOS. Such port-down event is recognized by ONOS that in turn marks the link as failed, broadcasting a LINK_DOWN event. The link event gets parsed by the optical path computation module inside the Intent subsystem that, with the updated topology recomputes the path (i.e., using remaining devices and links). Such re-computation is automatically triggered by the LINK_DOWN event, which is processed by the controller with no human intervention. Upon re-computing the path according to the available topology and selected algorithm ONOS installs the required rules in the devices along the path. The provisioning follows the same approach of Section V-A.
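The re-computation triggered by a LINK_DOWN event can be illustrated with a breadth-first search over the links that are still up. This is a generic sketch and not the ONOS path computation module; the topology mirrors the demo setup (two transponders, two ROADMs, redundant inter-ROADM links), and the node and link identifiers are placeholders.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Link = Tuple[str, str, str]  # (link id, endpoint a, endpoint b)


def shortest_path(links: List[Link], down: Set[str], src: str, dst: str) -> Optional[List[str]]:
    """Fewest-hop path from src to dst as a list of link ids, skipping failed links."""
    adjacency: Dict[str, List[Tuple[str, str]]] = {}
    for link_id, a, b in links:
        if link_id in down:
            continue  # a failed link is excluded from the topology
        adjacency.setdefault(a, []).append((link_id, b))
        adjacency.setdefault(b, []).append((link_id, a))
    queue = deque([(src, [])])
    visited = {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            return path
        for link_id, neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [link_id]))
    return None  # no path left: the intent cannot be recovered


links = [
    ("t1-r1", "T1", "R1"),
    ("r1-r2-a", "R1", "R2"),
    ("r1-r2-b", "R1", "R2"),
    ("r2-t2", "R2", "T2"),
]
before = shortest_path(links, set(), "T1", "T2")
after_cut = shortest_path(links, {"r1-r2-a"}, "T1", "T2")
```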
The same ping-based time measurements shown in Section VIII-A have been carried out for the fiber-cut scenario, resulting in 1.9 seconds of disruption. The following capture shows a total disruption time of 1.9 seconds in the case of a fiber cut, with 200 ms spent in signaling the loss to ONOS, 800 ms to re-compute the path and the expected 900 ms for the lasers to be re-tuned.
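As a quick arithmetic check of the two recovery budgets reported in this section, the individual contributions can be summed as below (all figures are those stated in the text; the dictionary keys are descriptive labels only).

```python
# Per-phase contributions in milliseconds, as reported in Sections VIII-A and VIII-B.
budgets_ms = {
    "loss_of_configuration": {
        "fault_detection": 400,
        "recompute_and_reinstall": 100,
        "laser_retune": 900,
    },
    "fiber_cut": {
        "signaling_to_onos": 200,
        "path_recomputation": 800,
        "laser_retune": 900,
    },
}

totals_ms = {scenario: sum(parts.values()) for scenario, parts in budgets_ms.items()}
print(totals_ms)  # {'loss_of_configuration': 1400, 'fiber_cut': 1900}
```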

C. Device Failure
The same mechanism is used in case of device failure. ONOS tries to avoid the failed resource as in the previous case. In particular, when the ONOS controller loses the TCP/SSH connection to a device, it considers such device to be in a failed state. Device disconnection triggers a DEVICE_DOWN event that triggers the recovery of intents traversing the failed device. ONOS tries to re-use as much as possible of the existing path, configuring on this fallback path the same wavelength for ports and links. All of the event handling, path computation and fail-over handling is done in the ONOS intent subsystem. In the case of a DEVICE_DOWN event, too, the provisioning happens as in Section V-A. In this scenario, the recovery time also depends on the actual mechanism used to detect the liveness of the device. Relying on the default mechanisms associated with the transport protocols (TCP, SSH), the detection of a failed device can take several seconds or minutes. Using keep-alives (directly if the protocol supports such a feature, or by polling the device with operations in order to obtain a reply) can reduce this time, but again at the expense of an increased message rate.
The case of a device failure yields recovery measurements close to the fiber cut because the procedure in ONOS is the same.

IX. CONCLUSION
This demonstration has shown the use of the ONOS SDN controller to provision data connectivity services across a disaggregated optical network with real hardware exposing common and open data models. Such a deployment enables advanced recovery, both at the control and data planes. For the former, the recovery leverages ONOS distributed stores and offers high availability through Atomix, providing a level of robustness against failures of the control processes. For the latter, ONOS offers failure detection mechanisms that immediately react to network failures, re-configuring the overall network to continuously provide end-to-end services with a disruption of 1.9 seconds in the case of data plane failures and no signal disruption in case of control-plane failures.
This demonstration showcased the first example of a resilient control plane for optical networks achieved through the use of open source software, open APIs and open source device models. The chosen APIs, protocols and models reflect also the current requirements from service providers and offerings of the Optical industry.
Overall, this work showed the feasibility of the selected approach while further work is required to evaluate the scalability of the system with a realistic number of network devices and a realistic traffic matrix.