Telemetry-enabled Cloud-native Transport SDN Controller for Real-time Monitoring of Optical Transponders Using gNMI

This paper presents novel extensions to a cloud-native SDN controller for partially disaggregated optical networks that allow real-time monitoring of optical transponders. New SDN controller components are presented, along with the overall architecture. The evaluation is performed with gNMI-enabled OpenConfig transceivers.


Introduction
Disaggregation of optical network components enables flexible solutions and provides efficient, lower-cost network scenarios, which can help network operators maintain their profit margin while revenues decrease. Current optical network deployments are still based on end-to-end solutions that offer little interoperability [1]. Disaggregation provides operators with a multi-vendor ecosystem that can address their requirements. To limit the difficulties caused by system complexity, two models for network disaggregation have been proposed: partial and full disaggregation. Full disaggregation consists of complete interoperability of multi-vendor optical network elements, while partial disaggregation only considers the use of Open Line Systems (OLS) and multi-vendor transponders [2].
The Open Networking Foundation (ONF) has presented the Open Disaggregated Transport Network (ODTN) project [3], which is a partially disaggregated solution for open optical line transmission systems. ODTN implements an SDN architecture that integrates an ONOS-based SDN controller. The NorthBound Interface (NBI) of the SDN controller uses the HTTP-based RESTCONF Transport API (T-API) YANG model to build the optical topology and to create and delete optical routes. The SDN controller interacts with the underlying network equipment using NETCONF with OpenROADM YANG data models for the OLS and NETCONF with the OpenConfig device model for multi-vendor transponders [4].
NETCONF and RESTCONF have been extensively used to control transport networks, but novel measurement and telemetry mechanisms for real-time monitoring have recently emerged. gRPC is a cloud-native, high-performance Remote Procedure Call (RPC) framework that uses HTTP/2 and protocol buffers encoding [5]. The gNMI protocol is built on top of gRPC and is able to configure and monitor network elements based on already defined YANG data models [6].
Moreover, a novel cloud-native architecture has been proposed for SDN controllers [7], which decomposes a traditional SDN controller into micro-services. The use of micro-services has been demonstrated in [8] as a powerful solution for handling high request loads, as well as for introducing self-healing mechanisms in the SDN architecture. Currently, a strong effort is under way to redesign the ONOS core architecture for a cloud-native environment in the µONOS project [9].
In this paper, the authors extend their previous work on a cloud-native SDN controller [10] by introducing two novel components for transceiver management and telemetry monitoring. Telemetry monitoring is performed using the gNMI protocol and the OpenConfig device model. A proof of concept is developed in a joint testbed between CTTC and KDDI Research. We demonstrate the feasibility of the approach by obtaining real-time telemetry measurements and by analysing the scalability of the proposed solution.

Telemetry-enabled Cloud-native Transport SDN Controller
A cloud-native transport SDN controller was presented in [10]. The proposed architecture only considered the control and management of the OLS and did not take into account the necessary management of transponders and network monitoring. To this end, we have extended our architecture with two novel components, as shown in Figure 1. The proposed architecture is based on micro-services that interact with each other using gRPC and protocol buffers through an integration fabric. An NBI micro-service is responsible for translating HTTP requests into the native protocol buffers and for triggering the necessary actions in the corresponding micro-service. More details of the internal workflows and benefits of the architecture are described in [10]. The transponder micro-service is responsible for handling the control and management of optical transponders. It provides a generic interface to the other micro-services, while supporting several South-Bound Interfaces (SBI) for specific transponders; our solution uses OpenConfig data models with the NETCONF or gNMI protocols. Other SBIs could be added, such as OpenROADM or IETF models. The connectivity micro-service is responsible for triggering the necessary configuration of transponders once an OLS connection has been established or removed.
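The relationship between the generic interface and the protocol-specific SBIs can be pictured as a plugin pattern. The following is a minimal sketch under stated assumptions, not the controller's actual code: the class names, the `configure` signature and the returned dictionaries are hypothetical, and the plugins only describe the request they would send instead of opening a NETCONF or gNMI session.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class TransponderPlugin(ABC):
    """Hypothetical SBI plugin contract: one subclass per protocol/model pair."""

    @abstractmethod
    def configure(self, address: str, frequency_mhz: int, power_dbm: float) -> dict:
        """Push an optical-channel configuration to the device at `address`."""


class NetconfOpenConfigPlugin(TransponderPlugin):
    def configure(self, address: str, frequency_mhz: int, power_dbm: float) -> dict:
        # A real plugin would issue a NETCONF <edit-config>; this sketch only
        # returns a description of the request.
        return {"protocol": "netconf", "address": address,
                "frequency_mhz": frequency_mhz, "power_dbm": power_dbm}


class GnmiOpenConfigPlugin(TransponderPlugin):
    def configure(self, address: str, frequency_mhz: int, power_dbm: float) -> dict:
        # A real plugin would issue a gNMI Set RPC; this sketch only
        # returns a description of the request.
        return {"protocol": "gnmi", "address": address,
                "frequency_mhz": frequency_mhz, "power_dbm": power_dbm}


class TransponderService:
    """Generic interface seen by the other micro-services (illustrative)."""

    def __init__(self) -> None:
        self._plugins: Dict[str, TransponderPlugin] = {}
        self._devices: Dict[str, Tuple[str, str]] = {}  # id -> (address, plugin name)

    def add_plugin(self, name: str, plugin: TransponderPlugin) -> None:
        self._plugins[name] = plugin

    def register(self, device_id: str, address: str, plugin_name: str) -> None:
        if plugin_name not in self._plugins:
            raise KeyError(f"no SBI plugin named {plugin_name!r}")
        self._devices[device_id] = (address, plugin_name)

    def configure(self, device_id: str, frequency_mhz: int, power_dbm: float) -> dict:
        # Callers never see which protocol is used; the registry decides.
        address, plugin_name = self._devices[device_id]
        return self._plugins[plugin_name].configure(address, frequency_mhz, power_dbm)
```

With this shape, supporting a further SBI such as OpenROADM would only require adding another plugin subclass; the callers of `TransponderService.configure` would remain unchanged.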
In order to handle the continuous collection, storage and retrieval of monitoring data from various network equipment, we have introduced the telemetry micro-service. It provides an NBI that allows clients to subscribe, via either polling or streaming, to any monitoring data available from the underlying network elements, including transponders. Polling typically drains resources on devices, but it can be implemented on legacy networks. Streaming mechanisms, such as WebSockets or gRPC streams, use fewer resources. In both cases, time-series or event-driven data can be obtained, depending on the desired implementation.
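The difference in device load between the two subscription modes can be illustrated with a toy model. This sketch makes several assumptions that are not part of our implementation: `FakeTransponder` is a hypothetical stand-in for a device, its measured values are made up, and "load" is reduced to the number of requests the device has to serve.

```python
from typing import Iterator, List, Tuple

Sample = Tuple[int, float]  # (timestamp, value)


class FakeTransponder:
    """Toy device that counts how many requests it has served."""

    def __init__(self) -> None:
        self.requests_served = 0
        self._clock = 0

    def _measure(self) -> Sample:
        self._clock += 1
        return (self._clock, 1e-3 / self._clock)  # made-up, BER-like value

    def get(self) -> Sample:
        """Polling: every sample costs the device one request."""
        self.requests_served += 1
        return self._measure()

    def subscribe(self, count: int) -> Iterator[Sample]:
        """Streaming: one request, after which the device pushes `count` samples."""
        self.requests_served += 1
        return (self._measure() for _ in range(count))


def collect_by_polling(device: FakeTransponder, count: int) -> List[Sample]:
    return [device.get() for _ in range(count)]


def collect_by_streaming(device: FakeTransponder, count: int) -> List[Sample]:
    return list(device.subscribe(count))
```

In this model, collecting ten samples by polling costs the device ten requests, whereas the streaming subscription costs a single request for the same ten samples.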

Real-time Monitoring of Optical Transponders
OpenConfig is an informal working group of network operators who collaborate to obtain vendor-neutral, model-driven network equipment [11]. OpenConfig has provided a set of L0-L3 network equipment data models, as well as novel protocols for network element management, and their implementations.
The initial focus of the data models is describing, in YANG, the configuration and operational state of switching, routing, and transport network elements. The optical transport data models provide a configuration and state model for terminal optical devices, wavelength routers, optical amplifiers, channel monitors and protection switches.
In order to control and manage optical transponders, the terminal optics device model is used. In the data model, a logical channel is a logical grouping of logical grooming elements that may be assigned to subsequent grooming stages for multiplexing/de-multiplexing, or to an optical channel for line-side transmission. An optical channel, in turn, corresponds to an optical carrier and is assigned a wavelength/frequency. Operational state monitoring is crucial for network health and traffic management. Examples include counters, power levels, protocol statistics, up/down events, inventory and alarms. In Figure 2, the available measurements for terminal-device are detailed (BER, SNR, chromatic dispersion), including the proposed extensions on operational mode (differential group delay, carrier frequency offset).
The gNMI protocol is described using protocol buffers, and the transmitted data can be encoded either in JSON or in Protobuf. gNMI uses gRPC as its transport and thus offers an efficient alternative to the NETCONF and RESTCONF protocols. A gNMI target (server) is typically the network device, while the gNMI client is the system that controls and manages the gNMI target.
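gNMI addresses data with paths made of named elements, each optionally carrying key/value pairs that select a list entry. As an illustration only, the sketch below parses the textual form of such a path into that element structure using the standard library. The function names are hypothetical, and the example path for the pre-FEC BER of logical channel 1 is an assumption based on the public OpenConfig terminal-device model; the extended leaves of Figure 2 are not reproduced here.

```python
import re
from typing import Dict, List

_ELEM = re.compile(r"^(?P<name>[^\[\]]+)(?P<keys>(\[[^=\[\]]+=[^\]]*\])*)$")
_KEY = re.compile(r"\[([^=\[\]]+)=([^\]]*)\]")


def split_path(path: str) -> List[str]:
    """Split on '/' characters that are not inside [key=value] brackets."""
    parts: List[str] = []
    buf: List[str] = []
    depth = 0
    for ch in path.strip("/"):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "/" and depth == 0:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    if buf:
        parts.append("".join(buf))
    return parts


def parse_path(path: str) -> List[Dict[str, object]]:
    """Return one {'name': ..., 'key': {...}} dictionary per path element."""
    elems: List[Dict[str, object]] = []
    for part in split_path(path):
        match = _ELEM.match(part)
        if match is None:
            raise ValueError(f"malformed path element: {part!r}")
        keys = dict(_KEY.findall(match.group("keys")))
        elems.append({"name": match.group("name"), "key": keys})
    return elems


# Assumed example path (see the note above); key values stay strings.
PRE_FEC_BER = "/terminal-device/logical-channels/channel[index=1]/otn/state/pre-fec-ber"
elements = parse_path(PRE_FEC_BER)
```

Tracking the bracket depth while splitting keeps key values that themselves contain a slash, such as an interface name, inside a single element.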
In Figure 3, the sequence diagram is shown for the subscription to and retrieval of telemetry at the extended cloud-native SDN controller. It consists of three phases: 1) transponder registration; 2) obtaining transponder telemetry; and 3) retrieval of time series. Phase 1 is triggered by the OSS/BSS, and the request is forwarded to the transponder micro-service, which interacts with the transponder through the necessary plugin. The subscription of the transponder to telemetry retrieval is then also requested. In phase 2, the telemetry micro-service is responsible for the autonomous retrieval of transponder monitoring data through the transponder micro-service, which retrieves the information from the transponder (in either polling or stream mode). Finally, when data is requested (phase 3), it is retrieved from the telemetry micro-service.
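The three phases can be sketched with two in-memory classes. This is a simplified illustration under stated assumptions rather than the deployed micro-services: the class and method names are hypothetical, the device is replaced by a plain callable, the gRPC calls between micro-services are replaced by direct method calls, and timestamps are supplied by the caller instead of a clock.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Reader = Callable[[str], float]  # metric name -> current value


class TransponderMicroservice:
    """Phase 1: keeps a registry of transponders and how to read them."""

    def __init__(self) -> None:
        self._readers: Dict[str, Reader] = {}

    def register(self, device_id: str, reader: Reader) -> None:
        self._readers[device_id] = reader

    def read(self, device_id: str, metric: str) -> float:
        return self._readers[device_id](metric)


class TelemetryMicroservice:
    """Phases 2 and 3: collects samples and serves the stored time series."""

    def __init__(self, transponders: TransponderMicroservice) -> None:
        self._transponders = transponders
        self._subscriptions: List[Tuple[str, str]] = []
        self._series: Dict[Tuple[str, str], List[Tuple[int, float]]] = defaultdict(list)

    def subscribe(self, device_id: str, metric: str) -> None:
        # Requested at the end of phase 1.
        self._subscriptions.append((device_id, metric))

    def collect(self, timestamp: int) -> None:
        # Phase 2: one collection round through the transponder micro-service.
        for device_id, metric in self._subscriptions:
            value = self._transponders.read(device_id, metric)
            self._series[(device_id, metric)].append((timestamp, value))

    def query(self, device_id: str, metric: str, start: int, end: int) -> List[Tuple[int, float]]:
        # Phase 3: answered from storage, without contacting the device.
        samples = self._series.get((device_id, metric), [])
        return [(t, v) for t, v in samples if start <= t <= end]
```

Because phase 3 only reads stored samples, the rate at which clients query the time series is decoupled from the rate at which the transponder is polled or streams data.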

Experimental evaluation
The experimental setup consists of an SDN controller that has been deployed at CTTC premises in Castelldefels (Spain), and an SDN-enabled SDM/WDM transceiver deployed at KDDI Research in Saitama (Japan). The hardware setup is the same as described in [8]. For SDN control, the OpenConfig YANG data models have been extended as shown in Figure 2. The gNMI target (server) and client (as part of the transponder micro-service) that control and manage the transponder have been developed using gNxI, which is a collection of tools for network management that use the gNMI protocol [12]. Figure 4 shows the pre-FEC BER retrieved at the controlled and managed transceiver. It can be observed that polling or stream intervals can be set directly through the telemetry micro-service. In order to assess the scalability of the proposed solution, we have evaluated the number of transponders that can be handled by a single instance of the transponder micro-service. Figure 5 shows that the delay introduced when retrieving monitoring data from more than 16 transponders becomes excessive. This figure suggests that, in a cloud environment (such as Kubernetes, which is used in the detailed cloud-native SDN controller), the transponder micro-service should be replicated when more than 16 transponders are required.
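The replication guideline can be turned into a simple sizing rule. The sketch below is an assumption-laden illustration: the limit of 16 transponders per instance is the threshold suggested by Figure 5, while the function names and the round-robin assignment policy are hypothetical and were not evaluated in the experiments.

```python
import math
from typing import Dict, List

MAX_TRANSPONDERS_PER_INSTANCE = 16  # threshold suggested by Figure 5


def replicas_needed(n_transponders: int,
                    per_instance: int = MAX_TRANSPONDERS_PER_INSTANCE) -> int:
    """Smallest number of instances so that none exceeds `per_instance`."""
    if n_transponders < 0 or per_instance <= 0:
        raise ValueError("counts must be non-negative and the limit positive")
    return max(1, math.ceil(n_transponders / per_instance))


def assign(transponder_ids: List[str],
           per_instance: int = MAX_TRANSPONDERS_PER_INSTANCE) -> Dict[int, List[str]]:
    """Spread transponders over the replicas in round-robin order."""
    n = replicas_needed(len(transponder_ids), per_instance)
    shards: Dict[int, List[str]] = {i: [] for i in range(n)}
    for idx, transponder_id in enumerate(transponder_ids):
        shards[idx % n].append(transponder_id)
    return shards
```

For example, 40 transponders would require three instances under this rule, and the round-robin assignment keeps every instance at or below the limit of 16.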

Conclusions
We have proposed a novel cloud-native architecture for the retrieval of transponder telemetry based on OpenConfig and the gNMI protocol. The proposed solution has been validated and its benefits have been discussed.