A flexible information service for management of virtualized software‐defined infrastructures

There is a major shift in the Internet towards using programmable and virtualized network devices, offering significant flexibility and adaptability. New networking paradigms such as software-defined networking and network function virtualization bring networks and IT domains closer together using appropriate architectural abstractions. In this context, novel information management features need to be introduced. The deployed management and control entities in these environments should have a clear, and often global, view of the network environment and should exchange information in alternative ways (e.g. some may have real-time constraints, while others may be throughput sensitive). Our work addresses these two network management features. In this paper, we define the research challenges in information management for virtualized, highly dynamic environments. Along these lines, we introduce and present the design details of the virtual infrastructure information service, a new management information handling framework that (i) provides logically centralized information flow establishment, optimization, coordination, synchronization and management with respect to the diverse management and control entity demands; (ii) is designed according to the characteristics and requirements of software-defined networking and network function virtualization; and (iii) inter-operates with our own virtualized infrastructure framework. Evaluation results demonstrating the flexible and adaptable behaviour of the virtual infrastructure information service and its main operations are included in the paper. Copyright © 2016 John Wiley & Sons, Ltd.


INTRODUCTION
The Internet is gradually evolving by incorporating flexibility and programmability within the network infrastructure (such as software-defined networks [1, 2]), in data centres (using cloud technologies [3]) and in content distribution (with information-centric networking [4]). Software-defined networks introduce a new network paradigm that decouples the control plane from the data plane, whereby the control plane has either centralized or distributed programmable components that realize a global view and global management of the network. The most prominent solution for software-defined networking (SDN) so far, OpenFlow [1], is being deployed on both commercial [5] and experimental/academic networks [1, 6].
The adoption of virtualization technologies is gradually coming to network elements as virtual network functions (VNFs). That is, the network function virtualization (NFV) concept [7] allows deeper integration of networks with IT domains and their related operations, by creating network functions within a virtual machine. This approach allows significant cost savings and more flexibility in service provisioning. SDN and NFV can be seen as mutually beneficial, as they may coexist within the same infrastructure. Such environments deploy management and control components, called MCEs, that are used for exploiting, handling and communicating (i.e. between each other) management information. The MCEs are characterized by diverse requirements in terms of information manipulation. For example, a system network management operation that mitigates failures may require real-time constraints in information consumption, while a daily network routine process may have no such constraints and be delay tolerant.
Currently, each software entity designer uses his or her own ways to communicate and process state information. These tailor-made information handling facilities may be efficient for specific software tools but are associated with the following disadvantages: (i) they are not adaptable to the existing network conditions and the quality-of-service requirements of the participating MCEs; (ii) when faced with systematic problems (e.g. network congestion or failure), there is no way to prioritize the communication of the MCEs essential to overcoming the problem; (iii) the collected/processed state information may not be available to other MCEs that could benefit from it, or the information may not be represented in a compatible format; and (iv) the different information manipulation processes cannot be optimized in a collective manner.
We propose that such information interchange and handling can be abstracted away within a logically centralized information management infrastructure, designed with the unique characteristics of this new wave of challenging SDN/NFV paradigms and technologies in mind.
In other words, the information manipulation should have similar characteristics in terms of flexibility, adaptability, scalability and stability as the network environment it supports.
One may argue that it is difficult for one solution to satisfy the many diverse requirements in information handling that MCEs may have. A key characteristic here is to design the appropriate abstractions that allow a number of information handling strategies to coexist. Such a solution should select the most appropriate strategy each time, and the selection should be dynamic as well as adaptable to the network conditions and MCE requirements.
Another characteristic is the minimization of communication overheads, by evaluating the extra overhead that is potentially introduced by any change, for example, the communication cost to select and evaluate the most appropriate information handling strategy and to maintain the global picture. Our proposal allows simpler ways to communicate and process information, as the MCEs may choose a direct communication option, overriding the communication through the logically centralized entities. However, the information handling service should have the option to override the MCEs' choices, in case these contradict the global viewpoint (e.g. a different global performance goal or a systematic failure).
Other characteristics that such a new information management solution should support are the following: service deployment that is adaptable to the resource availability and user/system requirements; seamless integration with SDN/NFV and related technologies (such as software-defined data centres and cloud computing and networking); a number of closed management control loops, including those for information manipulation; tuning of the involved performance trade-offs at global and local levels; tackling of scalability and stability issues; and a reduced footprint with lightweight messaging.
These characteristics, together with the four research challenges for the future evolution of the SDN/NFV technologies, have been factored into the design of our flexible information service for virtualized software-defined infrastructures.

A flexible information service for virtualized software-defined infrastructures
In this paper, we introduce a new framework that targets the previously defined research challenges and characteristics, manifesting an abstracted and adaptable facility, called the VIS, released as open-source software (available at [19]). The VIS provides information manipulation capabilities to the software entities acting as consumers and/or producers of information, namely, the MCEs, which include SDN applications/controllers [2] and network and service management applications. The MCEs may be embedded at the network devices and virtual routers or deployed at the physical hosts (as distributed or centralized components). These MCEs may contribute to building the global picture of the environment and/or retrieve either high-level information abstractions or low-level state information. To achieve such functionality, the VIS supports the following facilities: (i) information collection, information aggregation (IA) and information dissemination; (ii) information storage, information indexing and information processing (such as initial attempts at knowledge production (KP), where we define knowledge as the global picture information for the network environment); (iii) information flow (i.e. the management information flow) establishment and optimization, supporting both global and local tuning of the involved performance trade-offs; (iv) alternative communication methods between information sources and information sinks, such as the push/pull, publish/subscribe (pub/sub) and direct communication methods; (v) logically centralized path optimization for the management of information flows; (vi) interfaces for both information exchange and the regulation of management information flows; and (vii) accommodation of extensions, in an architecture aligned to both physical and virtual network environments, that can improve its behaviour further.

The VIS organizes configurable information flows between a large number of entities acting as information sources and information sinks (i.e. the information flows communicate management information and are not the data flows of the network's data plane). Those entities initially specify their requirements and constraints (maximum data rate, requested communication method, etc.) at start-up. The VIS matches sources with sinks, finding the most appropriate data paths based on the global picture of the network, while specifying the details of the corresponding flows. Furthermore, the information flows are optimized at a global level, but with respect to each source's/sink's expressed requirements and constraints. At any point, the VIS may trigger renegotiations and re-establishments of some or all of the management information flows, in the case of a different high-level performance goal decision for the system or because of an unexpected event, such as a failure.
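The registration-and-matching step described above can be sketched as a toy broker. All class, field and method names here are hypothetical illustrations for this paper's concepts, not the actual VIS API:

```python
from dataclasses import dataclass

@dataclass
class FlowSpec:
    """Requirements/constraints an MCE declares at start-up (hypothetical fields)."""
    entity_id: str
    role: str                # "source" or "sink"
    info_type: str           # kind of management information offered/requested
    max_rate_kbps: float     # maximum data rate constraint
    method: str = "pub/sub"  # requested communication method

class FlowBroker:
    """Toy stand-in for the VIS matching step: pair sources with sinks per info type."""
    def __init__(self):
        self.sources, self.sinks = [], []

    def register(self, spec: FlowSpec):
        (self.sources if spec.role == "source" else self.sinks).append(spec)

    def establish_flows(self):
        flows = []
        for sink in self.sinks:
            for src in self.sources:
                if src.info_type == sink.info_type:
                    # honour both sides' rate constraints when setting the flow
                    rate = min(src.max_rate_kbps, sink.max_rate_kbps)
                    flows.append((src.entity_id, sink.entity_id, rate))
        return flows
```

In the real VIS, the matching additionally consults the global network picture for path selection and may later renegotiate established flows; this sketch captures only the constraint-respecting pairing.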
In summary, the VIS supports the following novel features: (1) it is specially designed to meet the characteristics of dynamic environments (e.g. SDN/NFV, clouds and/or hybrid deployments such as software-defined data centres and cloud computing and networking); (2) it supports information flow establishment, operation and optimization between the MCEs; (3) it provides logically centralized management of established information flows with respect to the diverse MCEs' demands; (4) it provides explicit support for adaptability, stability, scalability and flexibility; and (5) it operates over SDN and NFV infrastructures, augmenting them with information handling capabilities at both the northbound interface and controller levels.
This paper discusses the main research challenges in information handling for integrated SDN/NFV environments. Furthermore, it introduces the first relevant abstracted solution addressing these challenges so far, namely, the VIS, along with its architecture, interfaces and important design details of its sub-components and their interactions. It includes detailed descriptions of the VIS software components, the sub-components, the interfaces and the associated data exchange, interactions and operations between these components. The paper also includes representative VIS proof-of-concept results on its main operations and a functional validation analysis. We have documented the implementation details of VIS and an extensive experimental analysis of its non-functional characteristics in a complementary paper [20]. The two papers complement each other: this paper covers the problem statement and design details, whereas [20] covers the pragmatic implementation and large-scale experimentation with functional and non-functional features. Furthermore, paper [20] presents VIS in the context of information handling for, and as an extension of, the NFV Management and Orchestration (MANO) architecture [21] from the European Telecommunications Standards Institute (ETSI). Our VIS implementation inter-works, and is coupled and tested, with the very lightweight software-driven network and service platform (VLSP) [22,23], our framework that integrates NFV with SDNs. VLSP consists of two main management layers besides the proposed VIS: (i) the virtual infrastructure management (VIM) layer, which provides management and orchestration of virtual networks, and (ii) the lightweight network hypervisor (LNH), which provides SDN-enabled lightweight virtual routers and links that are suitable for large-scale experiments. Paper [23] introduces and motivates the complete three-layer infrastructure and includes a brief discussion of VIS.
The detailed design of the VLSP features, besides the VIS, and an evaluation of its main operations are presented in [22].
The paper structure is as follows: Section 2 contrasts the proposed framework with the related works; Section 3 gives the design details of the VIS; Section 4 discusses our experimental methodology and highlights experimental results validating the VIS and its main operations in terms of adaptability and flexibility; finally, Section 5 briefly discusses our next steps and concludes this paper.

RELATED WORK
Software-defined networks are associated with a major architectural shift in the Internet, namely the decoupling of network control from the data plane. Logically centralized control is gradually replacing the distributed, self-organized way in which the Internet behaves, although only within the boundaries of a single organization domain (enabling decision making aligned to its organizational structure). Software-defined networks exhibit the following properties [1]: logically centralized intelligence (network control has been decoupled from forwarding and resides at a logically centralized component, allowing decision making based on a global-level or domain-level view of the network); programmability (networks are controlled by software functionality, enabling application interaction with the network); and abstraction (applications consuming SDN services are abstracted from the underlying network technologies).
Detailed surveys in the field of software-defined networks are presented in [24] and [25]. Most deployments so far adopt SDN applications realizing centralized traffic engineering and achieve impressive performance improvements. For example, the average utilization of network links in software-driven wide area networks has been increased from 30-40% to near 100% [5]. From our point of view, the research focus should gradually move from software-defined networks to software-driven networks, where novel network management approaches can bring services and networks closer together [26]. This calls for new network architectures, standardized interfaces and management applications focusing on aspects beyond traffic engineering. NFV [7] concepts and systems allow a deeper integration of networks with the IT domains and their related operations. NFV shares with SDN its virtualization and programmability characteristics. An architectural comparison between the ONF SDN, ITU-T SDN, IRTF SDN, ETSI NFV [27], OpenStack Neutron and OpenDaylight [28] architectures is presented in [23]. The approaches related to information-centric networks [4] focus on the data flows in the network rather than on the management information flows between the MCEs. It is important that these features be supported by the deployed management and control components, including SDN applications, SDN controllers and information handling infrastructures. A global picture at a network or domain level should be maintained, using logically centralized intelligence, programmable means and an abstracted design. This means that network abstraction should be associated with abstracted network information manipulation.

Abstracted information management for software-defined networking/network function virtualization
The authors of [29] assess how SDN control state inconsistency significantly degrades the performance of logically centralized control applications. They identify two associated performance trade-offs: (i) between control application performance and state distribution overhead and (ii) between application logic complexity and robustness to inconsistency in the underlying distributed SDN state. In their work and many others, information management is tightly integrated within a corresponding SDN application or an SDN controller (e.g. for traffic engineering). For example, in [6], the proposed infrastructure (i.e. Procera) collects information through the Simple Network Management Protocol (SNMP) and particular kernel data structures exported to the user-level context of the operating system. Those event sources communicate with the controller by periodically sending files along with timestamps. Google documented an outage incident in its SDN wide area network deployment. According to [5], the outage could have been avoided if latency-sensitive operations had received higher priority than the throughput-intensive ones. Furthermore, the problem was detected at a very late stage, because of insufficient performance profiling and reporting. Along these lines, we suggest that SDN applications and controllers should be able to communicate information based on their own diverse requirements and constraints using an abstracted information management facility.
The following are solutions that are moving towards more abstracted ways to communicate information between the MCEs (e.g. SDN applications and controllers). Atlas [30] uses mobile agents that collect netstat logs and pass that information to the control plane. The latter uses an automated data consolidation and classifier generation logic. PANE [31] allows users, hosts and applications to express requests for resources, hints about future traffic and queries for current or future properties of the network to a logically centralized network controller. It introduces a corresponding end-user API. ALTO [32] hosts aggregated information to which each controller has a link. Its main goal is to guide applications to select one out of several hosts capable of providing the desired resource. HyperFlow [33] selectively publishes events on system state changes; other controllers replay all the published events to reconstruct the state. The XSP [34] protocol provides general and extensive support for interactions between applications and network-based services and between the devices that provide those services. XSP includes a 'call setup' phase, indicating connectivity requirements. Onix [35] uses a Distributed Hash Table (DHT)-based solution, supporting group membership and distribution mechanisms. It uses a data model that represents the network infrastructure, namely, the network information base.
In the following section, we discuss solutions that implement information management and monitoring over virtualized environments.

Information management and monitoring for virtualized infrastructures
Information management is an enabling technology for management functions achieving autonomicity and sophisticated resource optimization within virtualized infrastructures (such as clouds). Many different aspects relevant to information handling should be studied in parallel: information flow optimization, information representation, event-based and pub/sub mechanisms and monitoring functionalities. Information management is considered as a crucial component of autonomic management architectural models in virtual environments, as proposed in a number of research projects and studies, for example, in AutoI [46], RESERVOIR [3] and 4WARD [47]. Clark [48] introduces a knowledge plane architecture for the Internet, while in [49], an information management overlay is proposed, which is a knowledge plane providing focused-functionality management applications with the necessary information for realizing self-management capabilities.
Monitoring is associated with information management and is a major issue in network virtualization [50]. Existing monitoring systems such as Ganglia [51], Zabbix [52], Nagios [53], MonaLisa [54] and GridICE [55] have addressed monitoring in large distributed systems. They are designed for a fixed and relatively slowly changing physical infrastructure that includes servers, services on those servers, routers and switches. However, they have not addressed or assumed the rapidly changing environments that VIS considers (i.e. in terms of requirements and resource constraints). In this direction, VIS is integrated with the Lattice monitoring system, which has already been tested in diverse dynamic environments (such as clouds [56] and other virtualized infrastructures [57]).

Our approach contrasted to the related works
All of the aforementioned solutions seem to work well within their targeted contexts. The proposed VIS goes one step further: it abstracts information manipulation further away within such environments. This requires not only supporting alternative methods to create the network-wide state but also a flexible way to choose the most appropriate configuration each time. It allows each MCE to specify its own requirements and constraints. The VIS infrastructure is flexible enough to accommodate diverse needs, because there is no single method that works everywhere. However, VIS maintains a global view of the management information flows and may revoke configuration settings, when the global performance is impacted. Additionally, VIS supports most of the features of the aforementioned proposals, such as alternative communication methods/data storage, an information flow setting up/negotiation phase, indexing of information, dynamic monitoring and open end-user APIs. A more detailed description of the proposed VIS follows in the next section.
Our proposal augments the SDN/NFV architecture with a decoupled information manipulation component, the VIS. It does not compete with OpenFlow-oriented research; rather, it extends it at both the level of the SDN/NFV northbound interfaces and that of the SDN controller [2]. The proposed framework is not constrained by well-established SDN technology (such as OpenFlow), because it targets and operates as a higher-level information handling framework over a wider base of SDN/NFV infrastructures and applications.

THE VIRTUAL INFRASTRUCTURE INFORMATION SERVICE DESIGN DETAILS
This section presents the design details of the proposed infrastructure. There is a particular focus on the features enabling logically centralized, programmable and abstracted information manipulation. The VIS architecture, its sub-components and the associated interfaces are described and explained in detail in the following subsections.

Virtual infrastructure information service architecture, interfaces and functions
The VIS architecture is described here, first discussing the type of entities that use the VIS and then highlighting the interfaces and the main core functions.
As we show in Figure 1, three types of MCEs communicate with the VIS: (1) the high-level management applications, which are responsible for the efficient operation of the whole system and take optimization decisions based on the global picture, such as the SDN applications; (2) the MCEs; and (3) the MCEs deployed at virtual routers, which are responsible for resource-facing operations at the virtualization hypervisor level. The third MCE category covers the need to deploy a number of entities close to the resources. For example, in [58], the authors place an embedded module on OpenFlow switches close to the failure's source in order to reduce computational and network resource consumption. The VIS uses the following two separate interfaces for communication with the MCEs: (1) the information management interface, which is used for information manipulation configuration, including the MCEs' registration to the VIS (i.e. expressing requirements, capabilities and constraints in terms of information handling); this interface is associated with both internal VIS management activities and VIS-external information flow configurations, and it acts as the interface to the proposed logically centralized information management facility; and (2) the information exchange interface, which is used for the actual communication of management information. Detailed descriptions of each of the main VIS functions are included in the following subsections.

The information collection and dissemination function
The ICD function is responsible for information collection, sharing, retrieval and dissemination. The ICD is the communication front-end of the VIS, residing behind the information exchange interface and supporting alternative ways to communicate information, which can be configured in order to meet certain performance profiles. The basic information communication mode between MCEs and the VIS implies a proxy-like interaction, where all communication is handled via the VIS. However, the VIS also supports a communication method where it can redirect information-querying MCEs towards the appropriate resources (i.e. other MCEs) for direct interaction. Either the interested MCEs may request such a communication method from the VIS, or the IFEO function may enforce it transparently for optimization purposes. New ICD algorithms (e.g. for collection or dissemination) can be added in a plug-and-play fashion. This gives the VIS infrastructure the flexibility to meet new information manipulation demands as soon as they arise. The main ICD operations are elaborated as follows.
Information retrieval. The VIS retrieves information from a number of MCEs according to their information collection constraints, while it regulates communication to meet certain information collection and information sharing requirements.
Information collection is triggered from the VIS, whereby the VIS gathers the relevant information from an information source. Such a collection could take place in response to an information retrieval request from an MCE (often in the case where the requested information is not available in the VIS storage).
Information sharing is triggered from the MCE information sources. In this situation, the VIS is being given the relevant information, as such an information source shares the information with the VIS. An information source may share new information with the VIS or can update existing information.
Information retrieval can be of one of four types: (i) 1-time queries, which collect information that can be considered static, for example, the number of CPUs in a server; (ii) N-time queries, which collect information periodically for a certain number of times; (iii) continuous queries, which collect information in an ongoing manner; and (iv) unsolicited acquisition of subscribed information units. For each of these four types, there can be information collection or information sharing.
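The four retrieval types could be dispatched along the following lines. This is an illustrative sketch only, not the VIS implementation; the function names and the `fetch` callable are invented for the example:

```python
import itertools
import time

def collect(query_type, fetch, n=3, period=0.0):
    """Dispatch the four retrieval types (illustrative only).

    fetch: a zero-argument callable returning one information unit.
    """
    if query_type == "1-time":
        return [fetch()]                   # static info, e.g. the number of CPUs
    if query_type == "N-time":
        out = []
        for _ in range(n):                 # periodic, a fixed number of times
            out.append(fetch())
            time.sleep(period)
        return out
    if query_type == "continuous":
        # ongoing collection, modelled here as an unbounded generator
        return (fetch() for _ in itertools.count())
    if query_type == "unsolicited":
        return None                        # delivery is driven by pub/sub, not polling
    raise ValueError(query_type)
```

A continuous query would in practice be bounded by flow teardown or renegotiation; the generator here simply stands in for the ongoing nature of the query.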
The MCEs can provide information for collection or sharing with the VIS, set their information collection constraints during their registration phase or update the constraints during an MCE configuration update phase (e.g. to respond to a network event such as congestion).
Information dissemination. The ICD function performs information dissemination to a number of MCEs that act upon this information, for example, performing configuration changes. The MCEs can request information from the VIS, which may be available in the VIS storage or can be indexed.
The information sinks register their information retrieval requirements during their VIS registration or configuration update. This process enables the corresponding MCEs to act as information sinks. The information can be disseminated using one of the following methods: (i) pull: the MCEs may request information by explicitly requesting a particular type of information and obtaining the data as the response. They can make these requests either on a periodic basis (polling) or when a certain demand arises. Two types of pull method are currently supported: (1) the pull-from-entity method, where the VIS pulls the information from the source on behalf of the sink, and (2) the pull-from-storage method, where the VIS pulls the information from the VIS storage. The VIS can be configured to dynamically switch between the two methods (i.e. to retrieve the information from the source when it is not in storage, or vice versa); (ii) pub/sub: the MCEs can subscribe to receive a certain type of information; that is, they are automatically informed when this information appears or changes. Filtering and information accuracy objectives may be applied (e.g. to follow changes that are higher than a particular threshold). The MCEs maintain the information in their local storage, from which they either serve further information processing or act upon the new information; (iii) direct communication: the MCEs can communicate directly, when necessary, rather than sending data via the VIS. The MCEs are signalled by the VIS to communicate directly when the information flow establishment phase enforces that particular method. Although MCEs can have direct information exchange, this process may be revoked by the VIS, as it remains in overall control of all the management information flows; this may occur in the case of changes in the network or to the requirements.
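The pub/sub mode with an accuracy threshold can be illustrated with a minimal sketch. The names are hypothetical and the real VIS mechanism is considerably richer; the point is only the threshold-based filtering of notifications:

```python
class PubSub:
    """Minimal pub/sub with a per-subscriber change threshold."""
    def __init__(self):
        self.subs = {}  # info_type -> list of [callback, threshold, last_notified]

    def subscribe(self, info_type, callback, threshold=0.0):
        self.subs.setdefault(info_type, []).append([callback, threshold, None])

    def publish(self, info_type, value):
        for entry in self.subs.get(info_type, []):
            callback, threshold, last = entry
            # notify only when the change exceeds this subscriber's threshold
            if last is None or abs(value - last) > threshold:
                callback(value)
                entry[2] = value
```

A subscriber with a threshold of 5 would, for instance, be notified for the sequence 10, 12, 20 only on the values 10 and 20, since the intermediate change of 2 falls below its accuracy objective.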
An example of ICD usage is shown in Figure 2. Following the registration of an information source MCE (left) and an information sink MCE (right), the VIS communicates management information flow configurations to both entities, signalling direct communication and including the location of the information source. Such location information is part of the information availability registration of the source entity and is retrieved through the ISI function. The IFEO function oversees this information flow establishment process, considering active information flow communication measurements, the global performance goal in the system and the most recent registration information of the available entities.

The information storage and indexing function
The ISI function is a logical element representing a repository for registering MCEs and for storing and indexing information. The ISI function stores information at various levels of abstraction, ranging from MCE registration information and raw network state data to processed and global picture information. The ISI functionality includes methods and functions for keeping track of MCEs, including information registration and naming, MCE configuration, the information directory and indexing. An important storage aspect, which can assist the production of higher information abstractions handled by the IPKP function, is the inherent support of historical capabilities. For example, an MCE could request information that was stored in the past using an appropriate timestamp.
It should be noted that the KP functionality is not part of the ISI function; rather, the ISI offers storage facilities for knowledge derived from earlier calculations. The ISI optionally stores knowledge produced by the IPKP function, in the form of global picture information regarding the network (e.g. as an input to sophisticated inference mechanisms).
The different MCEs, which request or store information to the VIS, do not directly communicate with the ISI. The ICD function handles information collection or dissemination between the storage points and the MCEs.
The VIS parameterizes the ISI function based on the current conditions or global performance goals. The ISI may be characterized by alternative structures or configurations (e.g. the number of storage nodes, in the case of a distributed storage) that are suitable to a particular environment or network condition (e.g. the presence of congestion in the network). The main ISI operations are elaborated as follows.

MCE registration. All MCEs must be registered to the VIS. This process allows the MCEs to express their information manipulation requirements and their capabilities. The ISI function maintains an MCE registry, including specifications for the available information to be collected, retrieved or disseminated. If an MCE is already registered, the particular MCE configuration is updated. This process may result in renegotiations of existing management information flows, in order to meet the updated requirements and constraints; a similar case is studied experimentally in Section 4.

Information storage/location. The information communicated through the ICD function is optionally stored in the information storage. Once stored, it can be passed back to the ICD function for dissemination. Alternatively, information need not be stored immediately but could be stored after the completion of an IA or knowledge production process (i.e. we consider as knowledge the global picture information). The information communicated through the ICD function is optionally processed to create information locaters. These locaters are pointers to the original data rather than containers of the actual data. Information locaters can be collected as part of an IA or KP operation. Furthermore, they may be used in the establishment of a direct communication flow between MCEs.
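The notion of an information locater, a pointer to where the data lives rather than the data itself, might be modelled roughly as follows. All names and the address format are hypothetical illustrations:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InfoLocater:
    """A pointer to an information unit, not the unit itself."""
    info_type: str
    source_id: str
    address: str  # e.g. host:port of the producing MCE

class Registry:
    """Toy ISI-style directory: resolves information types to locaters,
    e.g. when establishing a direct communication flow between MCEs."""
    def __init__(self):
        self.index = {}

    def register(self, locater: InfoLocater):
        self.index.setdefault(locater.info_type, []).append(locater)

    def lookup(self, info_type):
        return self.index.get(info_type, [])
```

Resolving a locater instead of shipping the data keeps the directory small and enables the VIS to hand a sink the source's location when direct communication is enforced.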
In Figure 3, we show example interactions of the ISI function. An MCE collects information from the network and passes it through the ICD function to the VIS storage, which is handled by the ISI function. At some later stage, another MCE retrieves that information through the ICD function, which in turn retrieves it from the ISI function storage. This example uses the pull from storage method. The figure also shows that the IFEO function retrieves periodic storage-related measurements from the ISI (i.e. system-oriented input) and the updated MCE registration information (i.e. local-oriented input), and can trigger configuration changes in the ISI (such as additional or relocated distributed storage nodes, or the choice of an alternative storage technology).

The information processing and knowledge production function
The IPKP is responsible for operations related to information processing (e.g. aggregation) and KP. In network environments, there is a frequent need to aggregate information in order to take decisions concerning a wider network scope (e.g. those based on the average link load in a network domain). In the VIS, this is overseen by the IA component of the IPKP function. The IA receives the collected data from the ICD function and processes them before they are stored through the ISI function or disseminated. The data may be filtered at the IA level; this reduces the volume of measurements by only sending values that differ significantly from previous measurements. Furthermore, the IA component itself is flexible enough to be given different aggregation specifications by a higher-level management application (e.g. a governance application), in order to process the data in varying ways. For example, it can be configured to wake up once an hour, select the data for the last day and then apply an aggregation function. This is achieved using a mechanism that relies on plug-ins. As well as requesting information, an MCE may subscribe to the event-based notification service (i.e. the pub/sub mechanism) by setting an appropriate threshold for a specific type of information. Whenever this threshold is exceeded, the subscribed MCE is notified. Aggregation is carried out during the information collection phase, in order to minimize overhead. More details on the IA facility of the VIS and relevant optimization aspects can be found in [59] and [60].
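The threshold-based pub/sub notification and the measurement filtering described above can be sketched as follows. This is a minimal illustration: the names, and the simple "new value versus last forwarded value" filter, are our assumptions, not the VIS code:

```python
class ThresholdNotifier:
    """Sketch of the event-based notification service: an MCE subscribes
    with a threshold on an information type and is notified whenever a new
    measurement exceeds it (illustrative names, not the VIS API)."""
    def __init__(self):
        self._subs = {}  # info_type -> list of (threshold, callback)

    def subscribe(self, info_type, threshold, callback):
        self._subs.setdefault(info_type, []).append((threshold, callback))

    def publish(self, info_type, value):
        for threshold, callback in self._subs.get(info_type, []):
            if value > threshold:
                callback(info_type, value)

class DeltaFilter:
    """Filtering at the IA level: forward a measurement only when it differs
    from the last forwarded value by more than `delta` (a stand-in for the
    accuracy objective)."""
    def __init__(self, delta):
        self.delta, self.last = delta, None

    def accept(self, value):
        if self.last is None or abs(value - self.last) > self.delta:
            self.last = value
            return True
        return False
```

For example, a `DeltaFilter(1.0)` applied to the samples 10, 10.5, 12, 12.3 forwards only 10 and 12, halving the volume of measurements sent.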
Accordingly, the KP component handles and produces global picture information. This type of information is produced from aggregated information and/or by combining processed information coming from several parts of the network. In both cases, reasoning and inference mechanisms and associated software components are required. These components may use a number of algorithms, depending on the exact problem that is addressed, the type of inputs that are used and the type of output that needs to be acquired. Such techniques come from scientific areas like statistics, clustering, reasoning, fuzzy logic or machine learning (including supervised, unsupervised and reinforcement learning techniques). The knowledge produced by the IPKP function can optionally be stored through the ISI function so as to be available to other MCEs when needed.
In Figure 4, we show example interactions involving the IPKP function. An IA or KP process is triggered from the ICD function when required by a newly arrived MCE registration. The IA and KP operations are optimized: (i) from a high-level viewpoint, through updates to the IA and KP algorithms, guided by a corresponding high-level application and taking global performance/accuracy measurements as input; and (ii) from a lower-level viewpoint, guided by the IFEO function and taking as input the MCE registration information, the global performance goals in the system and the current status of the topology.
In practice, the high-level view is responsible for the general behaviour of the IA and KP processes, while the IFEO function is responsible for the efficient adaptation to local demands. Such optimizations include the efficient placement of aggregation points, information filtering based on accuracy objectives and other information flow configuration aspects. Details of similar experimental evaluations can be found in [59] and [60]. The main IPKP operations are summarized as follows.
Information aggregation: This applies aggregation functions to the collected data/information. The aggregation process increases the level of information abstraction, transforming the data into a structured form while, at the same time, reducing the load on the network. Aggregation works in situations where MCEs do not need a continuous stream of data from the VIS but can get by with an approximation of the data values. For example, obtaining an occasional measurement with the average volume of traffic on a network link may be enough for some MCEs. Common aggregation functions include SUM, AVERAGE, STDDEV, MIN and MAX. Although these are the most common, arbitrary functions can be passed in, which gives considerable flexibility when determining aggregations. The IA operation uses information filtering based on accuracy objectives [59], [60].

Figure 5. Overview of the knowledge production operation. IPKP, information processing and knowledge production.
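The plug-in flavour of the aggregation mechanism described above can be illustrated with a small registry of aggregation callables; this is a sketch of the idea, not the VIS plug-in API:

```python
import statistics

# Registry of aggregation functions; arbitrary callables can be registered
# alongside the common ones, mirroring the plug-in flexibility described.
AGGREGATORS = {
    "SUM": sum,
    "AVERAGE": statistics.mean,
    "STDDEV": statistics.pstdev,
    "MIN": min,
    "MAX": max,
}

def register_aggregator(name, fn):
    """Plug in an arbitrary aggregation function under a chosen name."""
    AGGREGATORS[name] = fn

def aggregate(name, samples):
    """Apply the named aggregation function to a list of samples."""
    return AGGREGATORS[name](samples)
```

For example, a management application could register a custom `RANGE` aggregator (`lambda xs: max(xs) - min(xs)`) and have it applied to collected link-load samples exactly like the built-in functions.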
Knowledge production: This produces higher-level global picture information through processing and/or aggregating information. In both cases, reasoning and inference mechanisms and associated software components are required. Such techniques include statistical methods, clustering, reasoning, fuzzy logic or machine learning (e.g. supervised, unsupervised and reinforcement learning techniques). The necessary input information can be available in storage or can be produced in real time, using an information collection operation. We have investigated the issue of KP in the context of the UniverSELF project [61], using a number of alternative KP problems and solutions as input and targeting a generalized KP facility for the VIS. We demonstrated two solutions involving such generalized KP capabilities: [62] and [63]. This is a complicated aspect that deserves an independent study, which we consider future work. As an example, all of the information required to enable the VIS KP capabilities should be described in a relevant ontology, ready to be looked up by the IPKP function when such a demand appears (including the problems addressed, the types of inputs/outputs and the inference/reasoning mechanisms used). A first definition of a relevant ontology has been released in [61].
In Figure 5, we show an overview of the designed KP operation. The basic interactions involving the main IPKP components are described here. Consider an MCE that requires the VIS IPKP functionalities and requests either an IA or a KP operation. The ICD function handles the communication of the MCE with the internal IPKP functionalities, and the IPKP controller is responsible for controlling the internal IPKP components. The two IPKP operations (namely, IA and KP) require a number of fundamental steps, which are outlined as follows:
Step 1: Determining the IA or KP parameters (e.g. information filtering configuration, the inference/reasoning algorithm to use, translation requirements, whether aggregation is required and/or information/knowledge post-processing requirements). This process is handled by the IPKP controller, which matches the MCE's requirements and the type of problem to solve with the relevant information. The parameters are communicated to all relevant internal IPKP components.
Step 2: Collection of input information, either from an MCE that produces it or from the ISI function (i.e. the VIS storage). A collection request is passed back from the IPKP controller to the ICD function.
Step 3: Pre-processing of the input information (e.g. applying information filtering), where required. The pre-processing requirements are set by the IPKP controller. More details can be found in [59] and [60].
Step 4: In the case of IA, the input information is passed to the IA operation, where an aggregation process takes place according to the requirements (e.g. the aggregation function used) set by the IPKP controller. In the case of KP, this step may or may not be bypassed (i.e. a KP process may require aggregation before the inference/reasoning process).
Step 5: In the case of KP, the input information may need to be translated into a convenient representation, such as the Web Ontology Language (OWL) [64]. The translation configuration is set by the IPKP controller to match the requirements of the inference/reasoning mechanism identified from a relevant ontology, such as the one described in [61].
Step 6: The actual inference/reasoning process takes place in this step. The input information (in an appropriate form) and the relevant KP rules are passed to the identified inference/reasoning mechanism. One rule description language that can be used is the Semantic Web Rule Language [65]. The output of this process is the produced knowledge. This step is bypassed in the case of a request for IA without KP.
Step 7: The produced knowledge or aggregated information may need post-processing (e.g. filtering). This step is optional.
Step 8: At this stage, the result is communicated to the ICD function, to find its way to the requesting MCE. The produced knowledge or aggregated information can optionally be stored in the ISI function so as to be available to MCEs when requested/needed.
A relevant KP workflow is described in [61].
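The eight steps above can be sketched as a single pipeline. The parameter names and callables here are illustrative stand-ins for the IPKP controller's decisions, not the actual VIS interfaces:

```python
def knowledge_production(controller_params, collect, is_kp=True):
    """Illustrative sequencing of Steps 1-8 (not the VIS implementation).
    `controller_params` stands for the IPKP controller's decisions (Step 1);
    `collect` is a callable standing in for the ICD collection request."""
    data = collect()                                       # Step 2: input collection
    if controller_params.get("prefilter"):                 # Step 3: pre-processing
        data = [v for v in data if controller_params["prefilter"](v)]
    if controller_params.get("aggregate"):                 # Step 4: aggregation (optional for KP)
        data = [controller_params["aggregate"](data)]
    if is_kp:
        if controller_params.get("translate"):             # Step 5: e.g. translation to OWL
            data = [controller_params["translate"](v) for v in data]
        result = controller_params["infer"](data)          # Step 6: inference/reasoning
    else:
        result = data                                      # IA-only requests bypass Step 6
    if controller_params.get("postprocess"):               # Step 7: optional post-processing
        result = controller_params["postprocess"](result)
    return result                                          # Step 8: handed back to the ICD
```

In the real system, each stage corresponds to a separate IPKP component configured by the IPKP controller rather than a dictionary of callables.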

The information flow establishment and optimization function
The IFEO function regulates the management information flows based on the current state and locations of the participating components (e.g. the MCEs producing or requiring information). In particular, it controls how information collection and dissemination are handled by the ICD function (e.g. the communication method and data path used), the information aggregation in the IA operation (including optimal aggregation point placement) and the storage optimization configuration parameters in the ISI function. Furthermore, it guides a filtering system for information collection and aggregation points that can significantly reduce the communication overhead. Both the information dissemination and collection processes should meet certain collection/dissemination constraints, which are communicated to the VIS during the MCE registration process. For example, a number of MCEs may trade information accuracy for communication cost. Such accuracy objectives should also meet global performance requirements (i.e. harmonizing the management information flows with the global performance goals). The IFEO function is responsible for these quality enforcing functionalities.
In the IFEO function, a number of negotiations take place that match interests with constraints. The outcome of each negotiation is the set of management information flow parameters, balancing the capabilities of the information sources with the requirements of the information sinks and the potential global performance goals in the system. The IFEO function enforces optimization decisions by communicating with the corresponding MCEs and the VIS functions, in order to satisfy the performance optimization requirements. The aforementioned processes are part of the quality enforcement functionality of the VIS, and all corresponding decisions are taken by the information quality controller operation of the IFEO function.
The basic IFEO operations are detailed as follows.
Information flow establishment and optimization: This is responsible for the establishment and optimization of each management information flow. Flow establishment takes place either proactively (during an MCE registration) or reactively (during an MCE configuration update phase). The IFEO oversees the negotiation activity between potential information sources and sinks; the outcome is one or more flows parameterized appropriately to meet the particular optimization requirements. The management information flow constraints/requirements come from the MCEs during their registration/configuration update phases and are aligned with the high-level performance objectives of the system. This operation applies the optimization decisions coming from the information quality control operation, described next.
Information quality control: This operation is responsible for taking the management information flow optimization decisions. The information quality controller requires sophisticated heuristics that match requirements with constraints (i.e. local optimization) while respecting the global requirements in the system (i.e. global optimization). Different prioritization levels are set for these two optimization aspects, in case of conflicting goals. An MCE may participate in a number of co-dependent management information flows, so a change in the network environment or in the requirements may trigger a re-establishment of a group of, or all, existing management information flows. The IFEO function enforces the optimization decisions by communicating with the corresponding MCEs and the VIS functions, in order to satisfy the performance optimization requirements.
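A minimal sketch of the matching performed during such a negotiation follows, assuming (for illustration only) that capabilities and requirements reduce to a data-rate range and that the global goal is a rate cap; the real heuristics are considerably more sophisticated:

```python
def negotiate_flow(source_caps, sink_reqs, global_goal):
    """Toy local-vs-global matching in the spirit of the information quality
    controller. All parameter names are illustrative assumptions."""
    # Local optimization: intersect the source's supported rate range with
    # the sink's requested range.
    low = max(source_caps["min_rate"], sink_reqs["min_rate"])
    high = min(source_caps["max_rate"], sink_reqs["max_rate"])
    if low > high:
        return None  # no feasible flow: capabilities and requirements conflict
    # Global optimization takes priority over local preference on conflict:
    # the agreed rate is capped by the system-wide goal.
    rate = min(high, global_goal.get("rate_cap", high))
    if rate < low:
        return None  # global cap below the sink's minimum: flow rejected
    method = sink_reqs.get("method", "pull_from_storage")
    return {"rate": rate, "method": method}
```

A returned `None` would, in the real system, trigger a renegotiation with relaxed local requirements rather than an outright rejection.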
In Figure 6, we show how the IFEO function acts as the heart of the efficient operation of the VIS, optimizing: (i) the deployed/overseen management information flows, through the ICD function; (ii) the storage and indexing capabilities, through the ISI function; and (iii) the IA and KP features, through the IPKP function. All optimization aspects consider the global performance goals in the system, which come from corresponding high-level management applications. These management applications may trigger changes in all optimization algorithms, based on their specialized but global performance measurements (storage efficiency, communication cost, accuracy of processed information, etc.).

EXPERIMENTAL EVALUATION OF THE MAIN VIRTUAL INFRASTRUCTURE INFORMATION SERVICE OPERATIONS
The virtual infrastructure information service is available as an open-source solution at [19]. In this section, we present an evaluation exercise of the main operations of the VIS. First, we discuss our experimental methodology; then we give our experimental results in terms of the flexibility and adaptability of the information exchange process. In order to experiment with a complete virtualized environment, we built a test bed that integrated our VIS implementation with the VLSP platform [23]. In [20], we provide the VIS implementation details and an extensive functional experimental analysis in an NFV MANO [21] context. VLSP is our unified approach to NFV and SDN, which is also available as an open-source solution at [66].
The very lightweight software-driven platform (VLSP) supports (i) software-defined virtual network topologies and their associated management, and (ii) efficient service deployment and operation with the relevant management aspects, rather than just flow management. VLSP collects measurement data using the Lattice monitoring system, presented in [57], and optimizes distributed node placement using the algorithms proposed in [60]. The resource allocation mechanism used in VLSP was presented and evaluated in [67]. The same paper describes an early implementation of some components used in the VIM and LNH layers. The VIS, VLSP and Lattice components can all be downloaded from [68], where more details of the components can be found.

Experimental methodology
Every experiment starts with the creation of a virtual network topology over a physical network of 11 servers, each with four CPU cores and 8-32 GB of physical memory. The network consists of 100 virtual routers and a number of randomly created virtual links. The connected routers are chosen randomly, using the Barabási-Albert (BA) preferential attachment model [69], which reproduces some features of the real Internet topology. Each new virtual router is assigned dynamically by the VLSP to the physical machine with the least processing load. We have investigated more sophisticated resource allocation algorithms, results of which can be found in [67].
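For reference, a minimal BA generator and least-loaded placement rule might look as follows. This is an illustrative sketch; the experiments use VLSP's own implementation, and library generators such as `networkx.barabasi_albert_graph` are a tested alternative:

```python
import random

def barabasi_albert(n, m, seed=None):
    """Barabasi-Albert preferential attachment: each new node attaches to m
    existing nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))   # the first new node attaches to the initial nodes
    repeated = []              # node list with multiplicity equal to degree
    for new in range(m, n):
        edges += [(new, t) for t in targets]
        repeated += targets + [new] * m
        # Sample m distinct, degree-biased neighbours for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges

def least_loaded(loads):
    """VLSP-style placement rule: assign each new virtual router to the
    physical machine with the least processing load."""
    return min(range(len(loads)), key=loads.__getitem__)
```

With `n = 100` and a small `m`, this yields the heavy-tailed degree distribution characteristic of the BA model, producing a few highly connected hub routers.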
We implemented MCEs as simple management applications, with diverse information management requirements for testing purposes. All these applications support three communication methods (namely, push/pull, pub/sub and direct communication), and each can specify and update their own requirements at any point of the communication. This is carried out by changing the requested communication method, the local performance goal or the minimum/maximum data rates. By using the dynamic node selection algorithms presented in [60] and having as input the global network view, the output of the negotiation determines the most appropriate data paths for the management information flows.
The MCEs periodically transmit performance measurements to the VIS over the established management information flows, using the following metrics:

Average response time: The average time taken from the request of a piece of information by a sink to the point at which it is received.

Information freshness: The time taken from the production of new information to the point at which it reaches the requesting MCE. This is one way to quantify the 'quality of information'.

Average CPU load: The average CPU load value associated with the VIS software. This allows us to monitor the VIS behaviour in terms of processing requirements.

Total memory storage used: The total memory storage used by the VIS. The data for this metric come directly from the internal data structures and the chosen database technology (the redis NoSQL database [70]).
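The first two metrics, and the periodic averaging window, can be expressed compactly as follows. This is a sketch under our own naming, not the measurement code used in the test bed:

```python
import time

def response_time(request_ts, receive_ts):
    """Response time sample: from the request of a piece of information by a
    sink to the point at which it is received."""
    return receive_ts - request_ts

def information_freshness(production_ts, receive_ts):
    """Freshness sample: from the production of the information to its
    arrival at the requesting MCE."""
    return receive_ts - production_ts

class WindowedAverage:
    """Collects metric samples and reports their mean once per `window`
    seconds, mirroring the 10 s metric collection aggregator."""
    def __init__(self, window=10.0):
        self.window, self.samples, self.t0 = window, [], time.monotonic()

    def add(self, value, now=None):
        now = time.monotonic() if now is None else now
        self.samples.append(value)
        if now - self.t0 >= self.window:
            avg = sum(self.samples) / len(self.samples)
            self.samples, self.t0 = [], now
            return avg
        return None
```

In the experiments, one such aggregator per metric would emit an average every 10 s, and those averages are then averaged again over the 10 experimental runs.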
The average values of all the aforementioned metrics are calculated every 10 s in a separate metric collection aggregator. Because of the stochastic nature of our experiments, we evaluated the statistical accuracy of our results over 10 runs and calculated the average value of each measurement. This number of runs proved sufficient, producing very low standard deviations.
We deployed distributed VIS nodes using the PressureTime placement algorithm [60], whereby the number of VIS nodes depends on the topology size. After that, the VLSP assigns each MCE to the most appropriate VIS node, that is, the one closest to it. Finally, after a small warm-up period, the communication starts.

Evaluation results
In order to validate the main VIS capabilities, we have defined the following two scenarios:

Scenario 1 - VIS adaptability: demonstrates how the VIS adapts to different conditions, in terms of MCE requirements and the number of management information flows.

Scenario 2 - VIS flexibility: highlights how the VIS supports concurrent diverse needs, while serving a global performance goal.
The results of these two scenarios are presented next.

Scenario 1 -virtual infrastructure information service adaptability
For this first scenario, we experimentally explore the adaptability properties of the VIS, given the diverse network environment conditions and the varying requirements and constraints of the MCEs. We used a topology of 100 virtual routers, while the number of information flows ranged from 5 to 30. The scenario uses up to 60% of the routers as sources and sinks for network management and control information, and an additional number of routers for the distributed VIS nodes, thus matching a wide range of realistic distributed management and control software deployments in terms of flow numbers. We executed the experiments with the different communication methods outlined in Section 3. The communication method is one of many information flow configuration options that could be demonstrated here (Subsection 3.1.4 discusses other information flow optimization aspects).
According to Figure 7(a), which shows the CPU load, the VIS accommodates a number of flows well, based on resource availability. We use the pull from entities method in this example, but a similar behaviour was observed for the other methods as well. There is a minor increase in the processing load of the VIS as the number of management information flows increases; however, this increase is stable and predictable. According to Figure 7(b), the average response time also shows a minor increase as the number of management information flows increases. Here, the response time may exhibit minor jitter, in the range of milliseconds, which can increase with management information flow contention. We have determined that many of these spikes occur because of task and thread switching and other low-level OS processes that may run on the servers, and are not attributable to the VIS. These spikes would not occur with dedicated SDN communication devices that use separate network processors. Furthermore, fully distributed methods do not have this issue (Figure 7(c)): as the involvement of the VIS (including the centralized storage behind it) is gradually reduced, the jitter is reduced as well.

Scenario 2 -virtual infrastructure information service flexibility
In this scenario, we created a topology of 100 virtual routers with random virtual links. Using this topology, we experimented with 5, 10, 15 and 20 management information flows. Each time, 20% of the management information flows (i.e. the selected flows) were configured to use the direct communication method, while the rest used pull from storage. At some point in time, the application associated with the direct communication MCEs requests a change in its own requirements; this request triggers a renegotiation of the corresponding flows. The new requirement is an improvement in the average response time. The renegotiation process ends up converting the direct communication flows into pull from storage flows, communicating through the VIS. Using this scenario, we demonstrate how an application may change its requirements in terms of information manipulation and how the VIS responds, in a way that is aligned with its own global performance goal requirements; in other words, how local and global optimization aspects are balanced. This strategy can be associated with a control loop that detects and tackles performance problems. We plan to introduce such a management capability in the near future.
In the beginning of the experiment, we have eight flows communicating via the VIS (using the pull from storage method) and two flows communicating directly. After the 17th second, the latter two flows request a change in their communication method, which is granted by the VIS. After approximately one more second, the two flows have been re-established and reconfigured to use the pull from storage method. In Figure 8(a), we see a snapshot of how the type of flows changes with time, showing a total of 10 flows. As shown in Figure 8(b) and (c), the VIS accommodates the different numbers of management information flows with a slight increase in processing load and memory consumption per flow; the impact of this process on the CPU load and memory state of the VIS is insignificant.
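The flow conversion at the heart of this scenario can be modelled in a few lines (a toy illustration with hypothetical names, not the VIS renegotiation logic):

```python
from collections import Counter

class ManagementFlow:
    """Toy model of a management information flow whose communication
    method can be switched at runtime (hypothetical names)."""
    def __init__(self, flow_id, method):
        self.flow_id, self.method = flow_id, method

def renegotiate(flows, selected_ids, new_method):
    """Re-establish the selected flows with the requested method, leaving
    the remaining flows untouched, as in the granted Scenario 2 request."""
    for f in flows:
        if f.flow_id in selected_ids:
            f.method = new_method
    return flows

def method_histogram(flows):
    """Count flows per communication method (cf. Figure 8(a))."""
    return Counter(f.method for f in flows)
```

Starting from eight pull from storage flows and two direct flows, renegotiating the two selected flows yields ten pull from storage flows, matching the snapshot in Figure 8(a).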

The observed impact on the performance of the two flows and on the global performance is shown in Figure 9(a) and (b). We see that the trade-off between response time and information freshness is tuned in a way that satisfies the application using the selected flows: there is a reduction in the average response time after the flow re-establishment, but with an increase in information freshness. Regarding the global view (Figure 10(a) and (b)), there is no statistically significant impact on the average response time. However, the average information freshness increased, because of the higher information freshness of the selected flows.
In summary, the VIS is able to orchestrate the management information flows, balance local with global requirements and grant requests to tune performance trade-offs in a way that satisfies local applications without damaging the global performance.

CONCLUSIONS
In this paper, we have identified four research challenges for the future evolution of SDN/NFV technologies towards improving service awareness and information management in virtualized highly dynamic environments. As an important step towards tackling the identified open research issues, we propose the introduction of an abstracted information manipulation facility that is adaptable both to the requirements of the diverse management applications and to the constraints of the underlying resources.
Additionally, we have introduced and highlighted the design details and the characteristics of the VIS. The VIS is a new information management and orchestration infrastructure and service for highly dynamic virtualized environments, such as SDN-enabled and NFV-enabled virtual infrastructures and network clouds. To our knowledge, the VIS is the first proposal that resides between the applications and services and the network infrastructure and is able to orchestrate the management information flows while balancing global with local requirements. We have provided a proof-of-concept platform demonstrating these challenging features. Evaluation results demonstrating the behaviour of the main VIS operations are also included in the paper.
For future work, we consider the following: to experiment thoroughly with and improve the negotiation heuristics of management information flows, in order to support a wider range of parameters, considering a selection of important management problems; to investigate other local and global performance trade-offs that can be tuned, such as between energy efficiency and quality of service; to improve our facility towards better support of service chaining capabilities (such as those discussed in [8] and [71]); to investigate a number of optimization strategies and associate them with different high-level performance goals; to determine how the VIS behaves in dynamic environments and when considering the trade-offs associated with information flow negotiation complexity and delay-sensitive management applications; to observe the impact of resource allocation algorithms for different types of virtual resources, allowing us to reach even larger scales; to evaluate complete autonomic control loops, at both local and global levels, for addressing performance and stability problems; and to consider aspects of the information-centric networking paradigm [4], by having data applications that communicate over negotiated flows, while the global behaviour of the system is monitored and controlled in a logically centralized manner.