Formally specifying and checking policies and anomalies in service function chaining

One of the proposed management strategies for SDN networks is to specify traffic forwarding through policies, where each policy rule identifies a traffic flow and its traversed service chains. While network operators need to check network configurations as soon as possible, the SDN verification literature focuses on checking policy correctness during or after their deployment. This paper, instead, proposes early verification of forwarding policies before their deployment, by looking for the presence of anomalies that can potentially lead to erroneous and unexpected network behaviour. The proposed verification relies on a formal model that enables high flexibility in specifying both a forwarding policy and the set of anomalies to verify. The presented approach is efficient and highly scalable, as confirmed by tests with large networks.


Introduction
A recent innovation in networking is the Service Function Chaining (SFC) concept [1], which consists in instantiating an ordered sequence of network functions, and consequently steering a particular portion of packets (e.g. the ones of a particular user) through the deployed chain.
However, SFC services have introduced additional complexity and many challenges in flow management, addressed with the introduction of Software Defined Networking (SDN) [2], which centralises the network management logic into a single programmable Controller.
Network operators can simply use one of the existing network programming languages to program the SDN controller and dictate the forwarding behaviour of the network at run-time. Examples of such languages are the Flow-based Management Language (FML), Frenetic and Merlin. Even if some of the aforementioned programming languages perform their own validity checks before translating policy specifications into OpenFlow entries, these checks depend on the language adopted to program the network.
Instead, in order to achieve best network reliability and security, the administrator should have a uniform checking mechanism, independent of the adopted controller language.
In addition, so far the literature has proposed many OpenFlow-oriented verification tools (e.g., [3,4,5,6]) to check the violation of network invariants in the output of the forwarding policy translation (i.e., in the OpenFlow switch configurations). However, these tools detect problems during or after the deployment of the switch configurations. Instead, an earlier detection, done during the policy specification phase, would have two advantages. The first one is that, in case of error detection, error fixing is faster, because the fixing phase can start earlier, without even having to start the deployment phase. The second one is that, in case of error, the computational and storage resources necessary for translating the anomalous policy rules and for deployment are not wasted, as otherwise happens.
Early verification is fundamental, especially in the new smart and IoT environments managed through SDN, which are becoming essential elements in industrial network systems (INSs) with the advent of the Industry 4.0 and Factory of the Future paradigms [7]. In these systems, where security and safety are strictly interdependent and productivity is one of the main goals, the introduction of mechanisms for detecting unexpected behaviours before the deployment phase would improve the security and safety of the systems. Moreover, it can also avoid the waste of computational resources on the controller side, due to the translation and storage of erroneous inputs. [...] flow entries. However, the highest-priority entry may not be the most suitable one for managing that particular traffic flow, which can lead to anomalous behaviour.
In this paper, we mainly aim at enabling early error detection on forwarding policies, relying on a formal modelling approach. In practice, a precise and unambiguous meaning is given to a forwarding policy specification, independently of the adopted programming language and of its level of abstraction. A forwarding policy will thus be expressed by means of a single formalism that embraces the variety of abstractions offered by the existing SDN programming languages.
By anomaly, we mean any erroneous or unwanted policy specification (e.g., including errors, conflicts or sub-optimizations), which may be due, for example, to human errors, and which may cause misleading network conditions and states.
An example of anomaly is the violation of an operator-defined constraint of the SFC (e.g., network function ordering) or a conflict in the forwarding specifications.
We assume the correctness of the translation algorithm that generates OpenFlow rules from policy rules, because it is generally implemented as an automatic process, and thus we leave its verification out of scope.
Even if we eliminate anomalous policy specifications, other errors in network forwarding may still be present at run-time, due to wrong configurations installed into the network functions (e.g., wrong filtering rules installed in firewalls). Errors of this kind can be detected and solved by means of other approaches, such as the ones proposed in [8,9,10], which use complex network models and require more time-consuming verification algorithms. For this reason, the approach we are proposing does not substitute other more complex and accurate analyses, but it aims at early and fast detection of a number of anomalies already in the policy specification phase.
Another contribution of our approach is the possibility to define and verify custom anomalies specified by the operator, in addition to a set of pre-defined anomalies corresponding to general mistakes to be avoided in any network. This high flexibility in defining the anomalies to check is desirable because it enables the customisation of the verification process.
In our view, both custom and pre-defined anomalies can be specified using the same formalism, thanks to a set of operators that let one precisely and unambiguously specify the meaning of each anomaly. The anomalies specified by means of these operators are automatically translated into formulas in First Order Logic (FOL) that are finally fed to the verifier along with a policy to be checked.
In this way, the user is not exposed to the complexity of FOL.
In this paper, we also propose a possible pre-defined set of anomalies to be detected. This set includes novel anomaly classes specific to the SFC domain, in addition to those classes of anomalies that lead to errors in the derived OpenFlow configurations and that have already been studied in the literature [11].
The remainder of this paper is organized as follows: Section 2 presents the current state of the art; Section 3 summarizes the problem statement and contributions of this work; Sections 4, 5, 6 respectively describe the structure of a forwarding policy, the supported operators for specifying anomalies and the anomaly detection model.
We have also implemented an anomaly detection process, in order to evaluate the time required to verify policies for a whole network (Section 7). Finally, Section 8 concludes the paper and presents some possible future work.

Background
The most relevant works related to our approach can be divided into three categories, i.e., SDN verification, SDN programming languages and network policy analysis.
SDN verification. [...] Similarly, Anteater [13] verifies such invariants by expressing them as boolean satisfiability (SAT) problem instances, while NetPlumber [4] relies on Header Space Analysis (HSA) in order to detect forwarding loops and leakage problems.
The real-time approach consists of placing the verification tool as a layer between the SDN Controller and the network switches. This is the case of VeriFlow [3], which dynamically checks whether the absence of forwarding loops and black holes holds at each OpenFlow rule insertion.
The main limitation of such off-line and real-time tools compared to what we are proposing here is that they do not perform an early detection of errors and faults.
SDN programming languages. The literature presents a variety of SDN programming languages. Even though they do not focus on our main aim of checking network correctness, they share with our work the need to specify a forwarding policy and they also provide some form of checking on the policy that can be specified. For this reason, we analysed their variety in modelling packet forwarding in order to define a verification model flexible enough to fulfil the network operators' needs, while the checks they provide on policies can be considered as a basis for defining a set of pre-defined policy anomalies.
They let the user identify traffic flows by means of a low-level abstraction, i.e., predicates over standard OpenFlow headers (e.g., IP address, VLAN id, etc.) and operators like union and intersection applied on those predicates.
The same level of abstraction is offered by Pyretic [16], which allows sequential and parallel policy compositions in addition to what is offered by Frenetic and NetCore.
A similar approach is also adopted by Merlin [17]. [...] Hence, the use of a verification mechanism specific to the adopted language limits the set of errors and faults that can be detected.
Some of these languages also cover additional problems in network operation, like fault tolerance for FatTire and bandwidth allocation for Merlin. In contrast, we do not claim to cover such a range of problems; we only aim at detecting anomalies in forwarding policy specifications.
Network policy analysis. The literature focuses mostly on detecting redundancies and conflicts among the rules that make up a policy in several domains. Among such domains, the most relevant ones for our work are the filtering and the OpenFlow ones. In such domains, a conflict is generally seen as a faulty network state derived from two configuration rules (e.g., firewall rules and switch flow entries) that overlap and have different, conflicting actions. Redundancy of policy rules, instead, is generally considered a kind of sub-optimization in policy specification.
Part of our model has been inspired by the previous works on conflict analysis over such domains (e.g., [18], [19], [11]), but we have reinterpreted and extended them to be applied to the SFC domain, not only for conflicts or redundancy, and to be used also with a high-level modelling formalism.
In the filtering domain, Al-Sharer et al. [18] have proposed a tree-based representation of firewall configurations to check anomalies among filtering policies. In particular, the underlying formal model is able to detect an anomaly between two rules by checking which relationship exists between them. A similar approach has been proposed by Cuppens et al. [20], who have included the analysis of NID configurations. The main limitation of these works is that checking only relationships among rules limits the set of anomalies that can be detected, while we envision a model flexible enough to enable operators to define their own anomalies, in addition to a pre-defined set of anomalies to be checked in every network.
Other solutions have a similar limitation, like the one by Liu et al., which focuses on building anomaly-free and redundancy-free firewall configurations [21]. Such solutions check the relationships between rules two by two, while we envision a complete analysis over the whole set of rules that can detect anomalies triggered by one, two or many rules.
Regarding the OpenFlow domain, in addition to the fact that such proposals (e.g., [11], [22]) can perform only a late SDN network analysis, most of them search for conflicts in OpenFlow configurations only, and overlook other kinds of misconfiguration (e.g., network function ordering). In this direction, an interesting and promising work was proposed by Prakash et al. [23]. Through a Policy Graph Abstraction (PGA) to express OpenFlow policies and an algorithm to automatically compose such policy graphs, the authors are able to determine an appropriate service order and to resolve policy conflicts, while minimizing operator interventions.

Problem Statement and proposed solution
As claimed above, all the aforementioned policy-oriented programming languages offer only some static checks, and they lack an underlying formal model for both policy and anomaly specification. This means that an administrator can check only a limited and fixed set of errors and faults over the chosen forwarding policy rules, in case one of the existing SDN programming languages is used.
Moreover, these checks are not based on mathematically rigorous models.
Our aim is, thus, to offer administrators, on one side, a single mechanism that can check a richer set of anomalies in their forwarding policy rules and, on the other side, a mechanism to define their own anomalies to check. A mathematical foundation is given to these mechanisms by defining a formal language to specify a forwarding policy and a FOL-based model to specify custom anomalies and to detect their presence in the policy rules that will be enforced in the network.
In particular, our formal language and model rely on a middle level of abstraction between the high-level representation adopted by the existing SDN programming languages and the low-level rules installed into the network switches (e.g., OpenFlow rules). In this way, both top-down and bottom-up anomaly analysis can be performed.
In the former, administrators specify the preferred forwarding policy by using one of the SDN programming languages; such policy will be translated into our formalism to be checked by our anomaly model. In case no anomalies are identified, policy rules can be translated into the low-level configurations that will be installed into the network.
The bottom-up analysis, instead, may be applied in case an administrator manually changes the network configuration (e.g., by adding a new OpenFlow rule in a switch FlowTable) and wants to make sure that the new network configuration is correct. Thus, starting from the low-level rules installed in the network nodes, these are mapped into our policy model, which is analysed to check the presence of anomalies. The literature has proposed many OpenFlow-oriented verification tools that can detect the presence of anomalies introduced by updates of switch configurations. However, these tools can follow only the bottom-up approach and cannot perform a top-down analysis.
In our view, an anomaly represents any erroneous and undesired condition that an administrator wants to detect and eliminate in a forwarding policy, in order to guarantee a self-consistent policy and avoid some traffic forwarding errors in the network at run-time. We are targeting not only conflicts among policy rules but also, for example, anomalies triggered by a single rule, such as the violation of an ordering constraint in the sequence of functions specified by a single rule. For example, a network operator wants to ensure that a NAT is always configured to process traffic before a firewall. This means that we have to detect the anomalous situation when a NAT is located after a firewall in the SFC topology. Another example is when an administrator wants to speed up web services response by making sure all web traffic between users and their servers traverses a web cache.
In summary, we propose a verification approach that is: (i) independent of the language adopted to program the network; (ii) flexible enough to cover a large set of anomalies (i.e., not only conflicts); (iii) general enough to enable the use of the different levels of abstraction allowed by the aforementioned languages. Our model, in fact, can be integrated into any SDN Controller programmable with a policy-oriented language. The only thing network operators need is the addition of a language-specific module to map the forwarding policy from the adopted language into our model or vice versa. However, in this paper we are mainly interested in presenting the details of the model and its features, while the design of translation modules is left as future work. In the next sections, we present our formal model for both policy and anomalies in more detail, and we also provide a rich set of anomaly examples.

Forwarding policy model
In our model, a forwarding policy ($R_F$) is a set of forwarding rules (or simply rules), each one relating traffic flows to the SFCs those flows can traverse at run-time. A generic forwarding rule $r$ in a forwarding policy ($r \in R_F$) has the following structure:
$$r = (M, C, P) \tag{1}$$
where:
• $M$ is the traffic flow managed by the rule, which belongs to the set of all possible flows in a network ($M \in \mathcal{M}$);
• $C$ is the set of SFCs that $M$ can potentially traverse at run-time and it is a subset of the whole set of chains instantiated in the network ($C \subseteq \mathcal{C}$);
• $P$ is the set of properties associated to the flow $M$ and to the set of SFCs $C$ that $M$ can potentially traverse ($P \subseteq \mathcal{P}$).
Note that in our model we do not assume that all the packets of a flow $M$ necessarily traverse all the configured chains at run-time. In a real scenario, packet forwarding, in fact, depends also on network function configuration and state, thus a flow can be forwarded to zero, one, many or all of the allowed chains, and individual packets belonging to a flow can traverse different chains. As an example, let us consider a flow $M$ defined as all web traffic with a given source address, and let us assume we want this flow to be allowed to reach only a collection of web servers, all behind a load balancer, which selects at run-time the destination web server for each packet, based on its internal algorithm and state. In this case, we can write a policy that associates $M$ with a set of chains, each one including the load balancer and one of the destination web servers. In other cases, the packets of a flow can even traverse more than one chain at a time. This happens, for example, with a mirroring function that replicates the incoming flow onto different outgoing chains.
Note also that the proposed model does not consider rule priorities, as has instead been done in the OpenFlow domain ([11], [22]). This is because we are working at a higher abstraction level, where we lose the notion of order among forwarding rules. It is only when a forwarding policy is translated into OpenFlow flow entries that we need a priority in FlowTables.
Another reason for omitting priorities is that each forwarding rule specifies all the allowed chains for a set of flows. For this reason, in order to avoid ambiguity in a policy, forwarding rules should be specified with non-overlapping traffic flows. When this condition is violated, we have an anomaly in the policy according to our model.
Since network operators should not be limited to using one particular SDN programming language (e.g., FML, Merlin, Pyretic) and each language has its own formalism and abstraction level, our model has been designed with two levels of abstraction for specifying traffic flows.
Generally, a flow $M$ is modelled by referring to a set of network fields. A network field $n$ is an element of $N$ ($n \in N$) and the definition of $N$ varies based on the level of abstraction we adopt. In particular, to model $M$ we rely either on a set $N_H$ of packet header fields (i.e., header-based representation) or on a set $N_N$ of high-level names (i.e., name-based representation). A name is a label defined by network operators to represent elements in their networks, such as hosts, network functions, traffic types, VLANs and subnets. A set of names can thus be used to identify a particular traffic flow, for example the one of an SSH connection from a particular user to a particular external subnet.
The verification workflow of our model consists of receiving the forwarding policy (expressed in one of the two formalisms), performing the verification step, and reporting the anomalies detected, otherwise continuing with the SFC deployment. An additional step is performed in case policy rules are expressed in the name-based abstraction.
In this case, after the first verification step, the high-level policy is translated into the corresponding header-based representation and its correctness is checked again. This two-fold check enables high flexibility in the formalism to adopt, along with a more complete and early anomaly detection.
In the next sub-sections we present the details of how traffic flows are modelled and how the name-based representation can be mapped onto the header-based one.

Name-based representation
In the name-based representation, $N_N$ can be defined as follows:
$$N_N = \{\mathit{usr\_src},\ \mathit{usr\_dst},\ \mathit{trf\_type}\} \tag{2}$$
where each field $n_n$ of this set has a defined meaning. For example, in (2), the fields indicate respectively the sender, receiver and traffic type that characterize a traffic flow, but this set can be extended as needed to include more fields.
Generally, each network field $n_n$ has a type, i.e., the set of values that can be taken by the field. A value $v$ has to be specified for each field $n_n$ of $N_N$ in order to identify a traffic flow. In addition to specifying single values, it is also possible to specify sets of values or even any value, which is represented by the special symbol *. Hence, a flow $M_N$ in the name-based representation ($M_N \in \mathcal{M}_N$) is formally defined by a set of equalities, one for each $n_n$:
$$M_N = \{\mathit{usr\_src} = v_1,\ \mathit{usr\_dst} = v_2,\ \mathit{trf\_type} = v_3\}$$
Examples of values for trf_type are single values like "tcp", "udp", "ftp", "ssh" or sets of names like "{http, https}"³, while usr_src and usr_dst can indicate, for example, users, hosts, subnets, VLANs, etc. (e.g., "User1", "Department1" and "Turin").
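The name-based flow definition above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the helper names (`make_flow`, `flows_overlap`) are hypothetical, and the overlap test is purely syntactic, i.e., it compares names literally and ignores any hierarchy among names (such as a host belonging to a subnet).

```python
from typing import Dict, FrozenSet, Iterable, Union

ANY = "*"  # special symbol: the field may take any value

# Name-based fields as in (2); the set can be extended as needed.
NAME_FIELDS = ("usr_src", "usr_dst", "trf_type")

FieldValue = Union[str, FrozenSet[str]]


def make_flow(**values: Union[str, Iterable[str]]) -> Dict[str, FieldValue]:
    """Build a name-based flow: one equality per field, '*' when unspecified."""
    unknown = set(values) - set(NAME_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    flow: Dict[str, FieldValue] = {}
    for field in NAME_FIELDS:
        v = values.get(field, ANY)
        if v == ANY:
            flow[field] = ANY
        elif isinstance(v, str):
            flow[field] = frozenset([v])  # single name -> one-element set
        else:
            flow[field] = frozenset(v)    # set of names
    return flow


def fields_overlap(a: FieldValue, b: FieldValue) -> bool:
    """Two field values overlap if either is '*' or they share a name."""
    if a == ANY or b == ANY:
        return True
    return bool(a & b)


def flows_overlap(m1: Dict[str, FieldValue], m2: Dict[str, FieldValue]) -> bool:
    """Syntactic overlap only (assumption): every field must overlap."""
    return all(fields_overlap(m1[f], m2[f]) for f in NAME_FIELDS)
```

For instance, a flow built with `make_flow(usr_src="User1", trf_type={"http", "https"})` leaves usr_dst as `*`, so it overlaps any flow from "User1" that carries "http" traffic, whatever its destination.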

Header-based representation
We also provide another way to model a traffic flow $M$, where network fields refer to standard packet header fields.
In particular, our header-based representation of a flow relies on the set $N_H$ of packet header fields. [...]
Our model supports the translation of the name-based representation into the header-based one. We suppose this process is performed by an additional entity, named policy engine, similar to the one provided at run-time by programming languages based on names (e.g., FML).
³ trf_type can be initialized with the name of any other well-known protocol.
Our policy engine uses a knowledge base $K \subseteq \mathcal{K}$ that is a set of mappings from high-level names to corresponding low-level values. Formally, $K$ is a set of entries $k$, each one being a set of name-value pairs, where the value paired with "name" is the high-level name mapped by the entry and the other values are the corresponding low-level values.
The entries $k \in K$ can include different low-level values according to the type of high-level name they map.
The algorithm used by the policy engine to map a name-based flow specification $M_N \in \mathcal{M}_N$ into its corresponding header-based $M_H \in \mathcal{M}_H$ using a knowledge base $K \subseteq \mathcal{K}$ is formally represented by a function $\Psi$:
$$\Psi(M_N, K) = M_H$$
This function translates each field of $M_N$ into one or more fields of $M_H$. Note that the header-based fields into which each name-based field is translated can depend on the knowledge base. In our specific setting, they depend on the types (client or server) of the end users involved.
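As a rough illustration of the mapping performed by the policy engine, the following Python sketch translates a name-based flow into a header-based one through a small knowledge base. All concrete details are assumptions made for the example: the entry keys other than "name" (`ip`, `t_prot`, `port`), the header field names (`ip_src`, `ip_dst`, `t_prot`, `port_dst`), the sample addresses, and the restriction to single-valued names. It does not model the dependence on client or server types mentioned above.

```python
from typing import Dict, List

ANY = "*"

# Hypothetical knowledge base K: each entry maps a high-level "name"
# to low-level values. Keys other than "name" are assumptions.
K: List[Dict[str, str]] = [
    {"name": "User1", "ip": "10.0.0.5"},
    {"name": "WebServer1", "ip": "192.168.1.10"},
    {"name": "http", "t_prot": "tcp", "port": "80"},
]


def lookup(name: str, kb: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the knowledge-base entry whose "name" matches."""
    for entry in kb:
        if entry["name"] == name:
            return entry
    raise KeyError(f"name not in knowledge base: {name}")


def psi(m_n: Dict[str, str], kb: List[Dict[str, str]]) -> Dict[str, str]:
    """Sketch of Psi: map a name-based flow to a header-based flow."""
    # Start from a fully wildcarded header-based flow.
    m_h = {"ip_src": ANY, "ip_dst": ANY, "t_prot": ANY, "port_dst": ANY}
    if m_n["usr_src"] != ANY:
        m_h["ip_src"] = lookup(m_n["usr_src"], kb)["ip"]
    if m_n["usr_dst"] != ANY:
        m_h["ip_dst"] = lookup(m_n["usr_dst"], kb)["ip"]
    if m_n["trf_type"] != ANY:
        # One name-based field expands into several header fields.
        entry = lookup(m_n["trf_type"], kb)
        m_h["t_prot"] = entry["t_prot"]
        m_h["port_dst"] = entry["port"]
    return m_h
```

Here a single name-based field (trf_type) expands into two header fields, which mirrors the remark that each field of the name-based flow is translated into one or more header-based fields.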

SFC representation
As specified in (1), a forwarding rule also includes the service chains (i.e., SFCs) $C$ that can be traversed by the flow $M$. In detail, $C$ is the set of chains $c$ enforced by a rule, which is, in turn, a subset of all possible chains $\mathcal{C}$:
$$C = \{c^1, c^2, \dots\} \subseteq \mathcal{C}$$
Each function $f^k_w$ in a chain $c^k$ is one of the functions present in the network and it is modelled by the pair:
$$f^k_w = \langle \mathit{f\_id}^k_w,\ \mathit{f\_type}^k_w \rangle$$
where $\mathit{f\_id}^k_w$ is the function identifier and $\mathit{f\_type}^k_w$ is the function type, which necessarily has to belong to the catalogue of network functions ($\mathcal{F}$) offered by the operator. Thus we model a network chain as:
$$c^k = [f^k_1, f^k_2, \dots]$$
This approach offers a level of detail in modelling SFCs higher than the one offered by existing formalisms. The SDN programming languages that explicitly manage service chains (e.g., Merlin, FatTire) generally indicate just the types of functions, without being able to consider their real instances deployed into the network. Thanks to our approach, instead, network operators can describe their networks more precisely and perform more accurate checks.
To summarise, given a forwarding rule $r$, its name-based representation is:
$$r = (M_N, C, P)$$
while its header-based representation is:
$$r = (M_H, C, P)$$
Examples of properties, such as bandwidth and VNF node computational power, have been listed in [24,25].
$P$ is thus formally defined by a set of equalities between a property $p$ and its numerical value $v_p$.
We consider the value $v_p$ as a minimum value. This means that, for the traffic flow $M$ and for the chains in $C$, at least the value $v_p$ has to be allocated for the system resource indicated by $p$.
It is also possible to specify any value for a property, by means of the symbol * (i.e., no requirements about minimum quantity of resources are specied).
The next sections first introduce the relational operators that can be used for building anomaly specifications.
Such operators enable pairwise comparisons between the elements that compose a forwarding rule or that belong to different rules. We then introduce the anomaly model, which is a FOL formula that involves a set of pairwise comparisons.
To help the reader, from now on in this paper we indicate the elements of a forwarding rule $r_i$ as follows:
• $M$ is the traffic flow specification, regardless of whether it is name-based or header-based (i.e., $M_N$ or $M_H$);
• $M_i$, $C_i$ and $P_i$ are respectively the flow, the SFCs, and the properties dictated by rule $r_i$;
• $n_i$ is a generic network field of rule $r_i$;
• $n^g_i$ is the g-th network field in $r_i$ and $v^g_i$ is its value;
• $c^k_i$ specifies the k-th chain in the i-th rule.
We use this notation in case we are referring to different forwarding rules (e.g., $r_i$ and $r_j$), while in case we are indicating a single rule, we do not use any index as subscript to indicate the rule itself ($r$) and its elements.

Relational operators for anomaly specification
In order to enable the specification of anomalies, the model offers a set of relational operators. These operators enable the specification of pairwise comparisons ($x \in X$), each one involving network fields, SFCs and properties belonging to the same or to different rules. Formally, these comparisons are predicates that let us finally identify sets of matching forwarding rules. More precisely, if $x$ is a comparison that involves fields and SFCs belonging to the same generic rule $r$, $x$ can be regarded as a function of $r$ which returns the result (true or false) of the comparison evaluated on $r$. Moreover, $x$ identifies the set of rules $r$ such that $x(r)$ is true. If instead $x$ involves fields and SFCs belonging to two different rules $r_i$, $r_j$, then $x$ can be regarded as a function of two variables $r_i$, $r_j$ which returns the result (true or false) of the comparison evaluated on $r_i$ and $r_j$. Moreover, $x$ in this case identifies the set of pairs of rules $(r_i, r_j)$ such that $x(r_i, r_j)$ is true.
Note that for simplicity, in the following subsection, we do not describe the operators used to evaluate properties.
This is because we use the same operators used for numeric values (i.e., =, ≠, >, <).

Network field operators
We [...] and it is included by the Layer 3 protocol IP. As for usr_src and usr_dst, we suppose that a user name can be associated to a host name, which in turn belongs to a subnet: for example, the user "Alice" is associated to the host name "HostA" that is included in the subnet "Department1". These relations that bind names have to be specified by the operator and are added to the knowledge base of the policy engine, so that the verifier can take them into account.
The supported operators are listed below.
• equivalence (=): two fields are equivalent if the value(s) they can take (in the flows of the rules they belong to) are the same, even though they are different fields. For example, if $M_H$ includes port_src = [...] Of course, equivalence can also be applied to fields belonging to the flows of different rules;
• dominance: [...]
Also, the above operators can be applied not only to compare two fields but also fields and single values or fields and sets of values, with the obvious meaning.
Moreover, combinations of operators are allowed. This is the case of equivalence or dominance (i.e., a network field is equivalent to or dominates another one), equivalence or correlation (i.e., a network field is equivalent or correlated to another one) and equivalence or majority (≥, i.e., a network field is equivalent to or greater than another field).

SFC operators
As already mentioned, here we define new operators, in order to enable comparisons that involve SFCs. First, we introduce the following notation to represent ordered sequences (i.e., SFCs) and unordered sets of network functions:
• ordered sequence ([]): this notation was already introduced for the specification of SFCs inside forwarding rules. It is also used to represent ordered sequences of network functions in an anomaly specification. Via the wildcard character *, the proposed model supports the specification of unidentified functions, i.e., functions for which only the type is specified, not the identity. For example, a chain composed of a NAT followed by a firewall can be specified generically as [<*, NAT>, <*, FW>];
• set ({}): this notation can be used to specify unordered collections of functions. For example, a chain including an application firewall (L7_FW) and a DPI, not necessarily in this order, can be specified as {<*, L7_FW>, <*, DPI>}.
As for the comparison between SFCs that can belong to the same forwarding rule or to different rules, we extend the current literature by enabling pairwise comparisons between: (i) two chains of either the same or different rules; (ii) a chain and an ordered sequence of functions (i.e., a chain not managed by a forwarding rule); (iii) a chain and a set of functions. In some cases, the same operators can be used for different types of comparisons, the exact meaning of the comparison being determined by the types of the compared elements.
In case of comparison between two chains (of the same or of different rules, e.g., $c^k$ and $c^l$) or a chain and an ordered sequence (e.g., $c^k$ and $[f_1, f_2, \dots]$), the following operators can be used:
• equivalence (=): two chains are equivalent if they are the same ordered sequence of network functions (e.g., if [...]);
• dominance: a chain dominates another one when it contains the second chain as a subsequence and the two chains are not equivalent (e.g., if [...]);
• correlation (∼): two chains are correlated if none dominates the other, but they share an ordered sub-chain (e.g., if [...]);
• disjointness (⊥): two chains are disjoint if they do not have any sub-chain in common (e.g., if [...]).
The comparison between a chain and an unordered set of functions (i.e., $c^k$ and $\{f_1, f_2, \dots\}$), instead, can involve the following operators:
• correlation (∼): a chain is correlated to a set of functions if it contains some of those functions (e.g., [...]);
• disjointness (⊥): a chain and a set of functions are disjoint if they do not share any function (e.g., [...] {<mn, MN>});
• inclusion (⊂): a set of non-ordered functions is included into a chain if all of its functions are part of the chain (e.g., if [...]).
It is interesting to note that, in some particular cases, the inclusion and dominance operators take the same meaning. Let us consider, for example, that one wants to specify the condition that a network function $f$ belongs to a chain $c$. This condition can be expressed either by the comparison [...] By also supporting the inclusion operator (⊂), we then make the formula syntax less complex and less likely to be [...]
For SFC comparisons too, the model offers the negative forms of the aforementioned operators (e.g., non-correlation, non-dominance, etc.), and some combinations of operators (i.e., equivalence or dominance, equivalence or correlation, and inclusion or equivalence (⊆)). The support of negative and combined operators enriches the expressiveness of our model in describing the anomalies to check. Even though network operators could exploit existing formalisms like Merlin and FatTire for specifying forwarding policies, they would miss the high-level flexibility offered by our model in defining the anomalies to check. A feature we offer is the possibility to define both a positive and a negative form of an anomaly (e.g., "Traffic from User1 passes through FW" or "does not pass through FW").
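The chain-to-chain operators described above can be sketched in Python as follows. This is a minimal illustration, not the verifier used in the paper. It assumes that "subsequence" and "sub-chain" mean a contiguous run of functions, so that two chains share a sub-chain exactly when they share at least one function; under a non-contiguous reading, `contains` would have to change. Function names and the returned labels are hypothetical.

```python
from typing import Tuple

Function = Tuple[str, str]      # (f_id, f_type); f_id may be the wildcard "*"
Chain = Tuple[Function, ...]    # ordered sequence of functions


def same_function(f: Function, g: Function) -> bool:
    """Same type, and same identifier unless one side uses the wildcard."""
    return f[1] == g[1] and (f[0] == g[0] or "*" in (f[0], g[0]))


def equivalent(c1: Chain, c2: Chain) -> bool:
    """Equivalence: the same ordered sequence of network functions."""
    return len(c1) == len(c2) and all(
        same_function(f, g) for f, g in zip(c1, c2)
    )


def contains(c1: Chain, c2: Chain) -> bool:
    """True if c2 occurs in c1 as a contiguous sub-chain (assumption)."""
    n, m = len(c1), len(c2)
    return any(equivalent(c1[i:i + m], c2) for i in range(n - m + 1))


def dominates(c1: Chain, c2: Chain) -> bool:
    """Dominance: c1 contains c2 and the two chains are not equivalent."""
    return contains(c1, c2) and not equivalent(c1, c2)


def share_function(c1: Chain, c2: Chain) -> bool:
    """With contiguous sub-chains, sharing one function is sharing a sub-chain."""
    return any(same_function(f, g) for f in c1 for g in c2)


def relation(c1: Chain, c2: Chain) -> str:
    """Classify the relation between two chains (labels are illustrative)."""
    if equivalent(c1, c2):
        return "equivalent"
    if dominates(c1, c2):
        return "dominates"
    if dominates(c2, c1):
        return "dominated"
    if share_function(c1, c2):
        return "correlated"
    return "disjoint"
```

Because the wildcard is handled in `same_function`, a concrete chain can be compared directly against a generic ordered sequence such as [<*, NAT>, <*, FW>].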
In order to further enlarge expressivity, we also define another new operator that lets us specify comparisons related to the position of a function within a service chain:
• $\pi(f, c)$ returns the position of function $f$ within chain $c$, if $\{f\} \subseteq c$, or 0 otherwise. Let us consider for [...]
Finally, the model also allows one to check the membership of a chain $c^k$ within a set of chains $C_i$:
• membership (∈): this boolean operator returns true if a chain $c^k$ belongs to the set of chains $C_i$ of the forwarding rule $r_i$ and false otherwise. For example, [...] then $c \in C$ is true. The model also supports the negative form of this operator (i.e., ∉).
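The position operator can be used to express ordering constraints such as the "NAT before firewall" requirement mentioned in the problem statement. The sketch below is one possible reading, under these assumptions: positions are 1-based (so that 0 can mean "absent"), only the first matching function is considered, and the function type labels "NAT" and "FW" as well as the helper names are illustrative.

```python
from typing import Tuple

Function = Tuple[str, str]      # (f_id, f_type); f_id may be the wildcard "*"
Chain = Tuple[Function, ...]


def same_function(f: Function, g: Function) -> bool:
    """Same type, and same identifier unless one side uses the wildcard."""
    return f[1] == g[1] and (f[0] == g[0] or "*" in (f[0], g[0]))


def position(f: Function, c: Chain) -> int:
    """pi(f, c): 1-based position of the first match of f in c, 0 if absent."""
    for idx, g in enumerate(c, start=1):
        if same_function(f, g):
            return idx
    return 0


def nat_after_firewall(c: Chain) -> bool:
    """Anomalous chain: both functions are present and the NAT follows the FW."""
    p_nat = position(("*", "NAT"), c)
    p_fw = position(("*", "FW"), c)
    return p_nat > 0 and p_fw > 0 and p_nat > p_fw


def is_member(c: Chain, chains: Tuple[Chain, ...]) -> bool:
    """Membership: the chain belongs to the set of chains of a rule."""
    return c in chains
```

A single-rule anomaly can then be stated as "some chain of the rule satisfies `nat_after_firewall`", which corresponds to an existential quantification over the SFCs of that rule.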

Anomaly model
Formally, an anomaly a ∈ A is a predicate defined on one or more rules, where A is the whole set of forwarding anomalies (or simply anomalies) defined by a network operator.
For example, if r is a variable that represents a rule, an anomaly can be formally represented by a function a(r) that returns the boolean true if the anomaly is present in the single rule r and false otherwise. Similarly, if r_i and r_j are two rules, an anomaly can be defined as a function a(r_i, r_j) that returns the boolean true if the anomaly is present in the pair of rules (r_i, r_j). More generally, an anomaly that involves n rules r_1, r_2, ..., r_n can be represented by a function a(r_1, r_2, ..., r_n) that returns the boolean true if the anomaly is present in this set of rules and false otherwise. A policy is anomaly-free if no anomaly a ∈ A is true for any rule, or combination of rules, of the policy, according to the arity of a.
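As a rough illustration of anomalies seen as boolean functions of different arity, the sketch below evaluates each anomaly on every ordered tuple of distinct rules of the matching size. Rules are modelled as plain dictionaries, and the two sample anomalies and all function names are hypothetical; this is not how the prototype described later performs the check.

```python
import inspect
from itertools import permutations


def find_anomalies(policy, anomalies):
    """Return (anomaly name, rules) for every tuple of rules that triggers it."""
    found = []
    for anomaly in anomalies:
        # The arity of the predicate decides how many rules are compared at once.
        arity = len(inspect.signature(anomaly).parameters)
        for rules in permutations(policy, arity):
            if anomaly(*rules):
                found.append((anomaly.__name__, rules))
    return found


def is_anomaly_free(policy, anomalies):
    """A policy is anomaly-free if no anomaly holds on any tuple of its rules."""
    return not find_anomalies(policy, anomalies)


# Two hypothetical anomalies: one on a single rule, one on a pair of rules.
def bad_port_src(r):
    return r["port_src"] > 65535


def duplication(r_i, r_j):
    return r_i == r_j
```

Since ordered tuples are enumerated, a symmetric anomaly such as `duplication` is reported once per ordering of the pair; an implementation could deduplicate these reports if needed.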
In detail, an anomaly is formally specified by a set of Horn clauses that involve pairwise comparisons. Each clause is a conjunction of positive comparisons x_i on rule fields and chains, which implies the presence of the anomaly in a single rule or in a pair of rules. Hence, the structure of Horn clauses that define anomalies is as follows:
x_1 ∧ x_2 ∧ ... ∧ x_n → a(r)   or   x_1 ∧ x_2 ∧ ... ∧ x_n → a(r_i, r_j)   (3)
In practice, the intersection of the sets of rules identified by the comparisons that occur in the left-hand side of the formula is the set of rules in which the anomaly is present.
In order to be flexible enough, the model also supports existential (∃) and universal (∀) quantification over SFCs in the left-hand side of the Horn clauses, to specify that some comparisons have to be satisfied by at least one or by all the SFCs of a forwarding rule.
An example of an anomaly that refers to pairs of rules and that uses universal quantification is the rule duplication anomaly, which occurs when a policy includes two identical rules. This anomaly can be specified as the anomaly that is true when the pairwise equivalence between all the elements of two rules expressed in the header-based representation (r_i and r_j), including all the SFCs, is satisfied (formula (4)).

In this model, when we quantify universally on pairs of chains, we implicitly consider pairs of different chains. For example, to check the correlation among the SFCs in a forwarding rule, we can specify c_i^k ∼ c_i^l, ∀c_i^k, c_i^l ∈ C_i to check only pairs of different chains, without indicating explicitly that k ≠ l.

As anomaly specifications are FOL formulas with predicates defined over policy rules, they inherit the standard formal semantics of FOL formulas, while the semantics of predicates is determined by the semantics of the relational operators we have defined. The presence of anomalies can be detected in a policy specification by means of general-purpose tools such as automated theorem provers or production rule systems, as will be shown in Section 7. The correctness of verification results depends on the correctness of such tools, which is generally assumed.

Anomaly Types
In this subsection, we identify a number of types of anomalies, giving examples for each of them. Some of these examples can be considered as part of the set of pre-defined anomalies. Note that the pre-defined anomalies that we present as examples are not exhaustive, since we do not claim to cover the whole set of anomalies that arise in the SFC domain. The main aim of this paper is to enable the formal specification of policy rules and anomalies and the detection of a consistent number of anomalies as early as possible. Note also that the proposed anomalies do not include those anomalies that are combinations of the presented types. Of course, the model is able to treat and detect this kind of anomaly as well. However, we preferred to limit our presentation to these types of anomalies, and we encourage network operators to build customized ones.
The types of anomalies that we are proposing are based mostly on the object of comparison rather than on the comparison operator occurring in the formula. For this reason, in the following formulas that we use for defining anomaly types, we leave the operator unspecified and we indicate it generically by the symbol ⋄.

Single-Field anomalies
The anomalies in this class are those that involve only comparisons between single network fields and specific values. Thus, following the generic anomaly structure defined in (3), a comparison x that composes a single-field anomaly is expressed as x = n ⋄ v, where ⋄ is a generic comparison operator and v represents any well-formed value that may be compared with the network field n.
Examples of anomalies of this class are the ones triggered when the sender user refers to a non-authorized name (e.g., a name u that is in the list of unauthorized names "FakeUsers") or when port numbers are greater than their maximum values. Such anomalies belong to the set of pre-defined anomalies, since they are mistaken policy specifications in every network topology. They can be expressed by the following formulas:
usr_src = u | u ≺ FakeUser → BadUserSrc(r)   (5)
port_src > 65535 → BadPortSrc(r)   (6)
port_dst > 65535 → BadPortDst(r)
ip_src = ip_dst → BadIpAddress(r)
Of course, there are other examples included in the pre-defined set of anomalies supported by our model.
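The four formulas above can be read as independent checks on the fields of a single rule. The following sketch assumes a rule is a dictionary that may carry both name-based and header-based fields, which is a simplification; the content of `FAKE_USERS` and the function name are hypothetical.

```python
MAX_PORT = 65535
FAKE_USERS = {"Mallory"}  # hypothetical list of unauthorized names


def single_field_anomalies(rule):
    """Return the names of the single-field anomalies triggered by one rule."""
    found = []
    if rule.get("usr_src") in FAKE_USERS:
        found.append("BadUserSrc")
    if rule.get("port_src", 0) > MAX_PORT:
        found.append("BadPortSrc")
    if rule.get("port_dst", 0) > MAX_PORT:
        found.append("BadPortDst")
    # Only compare addresses when a source address is actually specified.
    if "ip_src" in rule and rule["ip_src"] == rule.get("ip_dst"):
        found.append("BadIpAddress")
    return found
```

Each check depends only on the rule itself, so anomalies of this class can be evaluated rule by rule without looking at the rest of the policy.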

Node Traversal anomalies
These anomalies are those that arise when a traffic flow can (or cannot) traverse one or more network functions.
In this model, we specify the traffic flow by comparing network fields with specific values (i.e., x = n ⋄ v).
Hence, such anomalies are expressed by comparing network fields with specific values and chains with ordered or non-ordered sets of functions. The forms of comparisons in these anomalies are: [...] where c is a generic chain specifier which can be existentially or universally quantified. In order to identify when a custom anomaly like "Web traffic does not pass [...] π(f_w, c)) and they may be required to hold for at least one or for all the chains of the forwarding rules that manage that flow. Of course, in order to express the flow for which the constraint is checked, network field comparisons can be used. Hence these anomalies are specified by formulas including the following comparisons: [...]

An example is when we want to ensure that a NAT is always configured to process traffic before a firewall. This means that we have to detect the anomalous situation when a NAT is located after a firewall in the SFC topology, which can be done by the following anomaly definition based on the position operator: [...] the equivalent formula in the header-based representation: [...]

Furthermore, another example is when we want to ensure that all traffic to web servers passes through web caches. The formula can be expressed as follows in the header-based representation:

Chain Constraint anomalies
This category includes anomalies that can be detected by comparing the chains of a set with one another (e.g., some chains in a forwarding rule are equal). Thus such anomalies contain comparisons between SFCs (c_k ⋄ c_w) that belong to the same forwarding rule, and network field comparisons to identify the flow (n ⋄ v). As an example, let us consider a network graph where web traffic is balanced on two chains (Figure 1), and let us assume that we require web traffic to be processed either by the same (i.e., equivalent) chains or by similar (i.e., correlated or dominated) chains. This means that the two chains must not be disjoint, and we can detect this anomalous situation by means of the following header-based formula:

Sub-Optimization anomalies
Such anomalies aim at detecting under-optimizations of the policy specification, i.e., situations where multiple forwarding rules can be replaced by a single rule. In order to detect such anomalies, it is necessary to discover the forwarding rules that have the same sets of chains and properties. Hence the anomalies in this class include the following comparisons: [...] Under this class, we include the duplication anomaly defined in (4). The following formulas are another example, where we detect those forwarding rules that refer to completely disjoint traffic flows but that enforce the same set of SFCs:

Conflicting anomalies
In our model, conflicts arise when two forwarding rules manage the same traffic flow but they do not specify the same sets of chains. If the two rules are installed into the network, inconsistencies in the traffic forwarding can be generated at run-time. Hence a formula for detecting this kind of anomaly includes the following comparisons: [...] The comparisons x that compose a conflicting anomaly have been selected so as to enable the specification of different types of relationships between two sets of chains.
An example of relationship is the case in which two sets C_i and C_j contain the same SFCs, or when C_i contains all the chains of C_j as a subset. Another case is when C_i and C_j do not have any chain in common. This means that the policy contains two forwarding rules that forward the same traffic flow to different sets of SFCs. This kind of conflicting anomaly can be detected for example by the following formula: [...] Note that some cases of "conflicting" forwarding rules according to the anomaly model (12) [...] In this example, r_j contains an additional SFC with respect to r_i, but this kind of policy (even if it is ambiguous and non-optimized) may not be a conflict because, for example, each of those chains contains a VPN functionality and the operator does not want to be notified in such cases.
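The case in which two rules manage the same traffic flow but share no chain can be sketched as a predicate on a pair of rules. The sketch assumes that "same flow" means identical field values and that chains are tuples of function names; both are simplifications, and the names used are hypothetical.

```python
def conflicting(r_i, r_j):
    """True if two rules manage the same flow but have no chain in common."""
    same_flow = r_i["flow"] == r_j["flow"]
    # Chains are tuples so that they can be stored in sets and intersected.
    no_common_chain = not (set(r_i["chains"]) & set(r_j["chains"]))
    return same_flow and no_common_chain


web = {"ip_dst": "8.8.8.0/24", "port_dst": 80}
r1 = {"flow": web, "chains": [("FW", "NAT")]}
r2 = {"flow": web, "chains": [("IDS", "WebCache")]}
r3 = {"flow": web, "chains": [("FW", "NAT"), ("VPN",)]}
```

With these sample rules, `conflicting(r1, r2)` holds, while `conflicting(r1, r3)` does not, since `r3` only adds a chain to those of `r1`; as noted above, whether such a case should be reported is left to the operator.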
In this paper, we consider as a pre-defined conflicting anomaly only the case when the two forwarding rules do not have any SFC in common, as defined in (13). All the other possible conflicting anomalies have to be specified by the network operators and are classified as custom anomalies (Table 4). An example of an operator-defined conflicting anomaly could be the case of a traffic flow that is managed by two forwarding rules that enforce "correlated" sets of SFCs (i.e., the two sets share some SFCs but they are not the same):

Global Properties anomalies
These anomalies arise when any flow violates some requirements specified by the network administrator about properties. Thus, a comparison x that is used to express this type of anomaly takes the form x = p ⋄ v_p. An example of an anomaly belonging to this class is when the network administrator requires that any flow consumes less than 1000 Mbit/s of bandwidth: in this case, any policy requiring more than 1000 Mbit/s for a flow is anomalous. Such anomalies are considered as custom anomalies.
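A comparison between a flow property and a value can be sketched as a small generic predicate, instantiated here with the bandwidth example. The property name `bandwidth_mbps`, the dictionary layout and the function name are assumptions made for illustration only.

```python
import operator


def property_anomaly(rule, name, op, value):
    """True if property `name` of the rule's flow satisfies `op` against `value`."""
    properties = rule.get("properties", {})
    # A rule that does not declare the property cannot violate the requirement.
    return name in properties and op(properties[name], value)


def bandwidth_anomaly(rule):
    """Custom anomaly: the flow requires more than 1000 Mbit/s."""
    return property_anomaly(rule, "bandwidth_mbps", operator.gt, 1000)
```

Passing the operator as an argument keeps the predicate independent of the specific comparison, in line with the generic form of the comparison given above.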
Specific Properties anomalies
These anomalies are specific cases of the previous ones. They arise when one or more specific chains (i.e., chains that contain specific functions) traversed by M violate some requirements specified by the network administrator. The forms of comparisons in these anomalies are: [...] These anomalies belong to custom anomalies.
An example is the requirement that no chain traversed by an HTTP flow has more than 1 TB of memory allocated for that flow. The violation of this requirement can be formalized as the following anomaly:

Implementation and performance
In order to evaluate our approach, we have implemented a Java-based prototype and tested its performance under different scenarios. We have run our tests on an Intel i7-4600U @ 2.10 GHz workstation with 8 GB of RAM. The main purposes of this experimental evaluation were: (i) validate our model in a real-case scenario; (ii) identify the main factors that influence the verification performance; and finally (iii) evaluate verification time as a function of the most influential factors.

Implementation
Our Java-based prototype exploits Drools [26] as verification engine. Drools is a Rule-Based System that uses the ReteOO algorithm to perform the inferences [27]. A Rule-Based System is a Knowledge-Based System that encodes information in the form of rules. Listing 1 shows the structure of generic rules in the Drools language.
In our prototype, every Drools rule represents an anomaly, while forwarding rules are implemented as Java objects against which Drools rules are checked. As an example, Listing 2 contains a rule used to check if a forwarding rule presents a wrong source port definition (i.e., there is a source port anomaly). In other words, the meaning of the rule in Listing 2 is:
When:
• there is a forwarding rule f with source port less than zero (lines 4 to 5);
• in the anomaly set as (lines 6 and 10) there are no other anomalies of type wrong source port (line 7) related to f (line 8).
Then:
• a new wrong source port anomaly related to f is created (lines 11 to 14).
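The when/then logic described above can be mimicked outside Drools with a short sketch. It is not the Drools rule itself, it only reproduces the two conditions and the action in plain Python; the dictionary keys and the function name are hypothetical.

```python
def check_src_port(forwarding_rules, anomalies):
    """Record a 'srcport' anomaly for each rule with a negative source port."""
    for f in forwarding_rules:
        # First condition: the source port of the forwarding rule is below zero.
        if f["port_src_start"] >= 0:
            continue
        # Second condition: no 'srcport' anomaly is already recorded for f.
        already_reported = any(
            a["type"] == "srcport" and f["id"] in a["rule_ids"]
            for a in anomalies
        )
        if not already_reported:
            # Action: create the anomaly and add it to the anomaly set.
            anomalies.append({"type": "srcport", "rule_ids": [f["id"]]})
    return anomalies
```

The second condition makes the check idempotent: evaluating it again on the same anomaly set does not create duplicate anomalies for the same forwarding rule.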

Validation and Performance Evaluation
For validation, we used a sample network scenario obtained by approximating part of our campus network. In order to evaluate the performance of the proposed approach, some tests were run using a number of synthetically generated scenarios, so as to have a rough estimation of processing times in different cases. [...] 3 and 4, where, for each anomaly, we present a possible formula that detects that anomaly for a specific flow.

Such scenario (shown in
Moreover, in the automatic generation process, we have set a threshold on the percentage of forwarding rules (with respect to the total rule set) that trigger an anomaly. Figure 3a shows that, for each rule-set size, we have evaluated the elapsed time with the following percentages of anomalous rules: 10%, 20%, 50% and 80%.
The obtained results indicate that the elapsed time to complete verification grows linearly with the number of forwarding rules. This is highlighted in the four test scenarios. Each measured time has been averaged over 100 test runs. Verification time grows up to 340 ms in the worst case (the solid line in Figure 3a).
In order to also evaluate the dependency of verification time on the percentage of forwarding rules that trigger an anomaly, we report another plot (Figure 3b). Keeping the number of forwarding rules constant, we plot verification time for percentages of anomalies growing from 0% to 100%. The plot in Figure 3b shows the behavior for a number of forwarding rules set to 100, 300, 500, 700 and 1000 rules. Once again the dependency is linear.
As we can note from the achieved results (Figure 3b), the percentage of forwarding rules that lead to an anomaly has a greater influence on verification time when the rule set size grows. This can be confirmed by comparing the trend in the case of 100 forwarding rules, where the elapsed time is almost constant, and in the case of 1000 rules, where the elapsed time grows more rapidly as the anomaly percentage increases.
The achieved results are also confirmed in an additional test case (Figure 3c), where we have evaluated verification time with a growing number of "anomalous" rules in rule sets of different sizes (i.e., 100, 300, 500, 700 and 1000 forwarding rules). In this test scenario too, it is evident that the performance of our verification approach is influenced by both the number of forwarding rules and the percentage of these that trigger an anomaly.
Moreover, we can also note that the verification time ranges between 340 ms, in the worst case with 1000 forwarding rules and 80% of "anomalous" rules (Figure 3a), and 400 ms, when each one of the 1000 forwarding rules triggers an anomaly (i.e., the solid lines in Figure 3b and Figure 3c).
The achieved results show that our verification approach takes a time in the order of hundreds of milliseconds in the case of a real-sized network with a growing number of traffic flows, and that this time increases linearly with the complexity. Moreover, we measured the memory required by our tool during the execution of synthetically generated scenarios. We noted that the memory consumption was approximately the same in all scenarios, and in the worst case (i.e., 1000 forwarding rules with 80% of anomalies) it was approximately 1 GB.
Considering the performance in terms of time and memory, it is reasonable to use our approach in a real network scenario.

Conclusion
In this paper we have proposed a formal approach to specify and verify SFC policies. According to the proposed approach, the presence of anomalies in a forwarding policy can be detected before deployment, i.e., before the policy rules are enforced by the SDN Controller and installed into the network switches. In order to achieve this goal, we have designed a two-fold formal representation of the forwarding policy that characterises packet forwarding in the network (i.e., in terms of traffic flows and service chains) and of the set of anomalies that have to be detected against the policy rules. This has been done using standard notations such as First Order Logic and Horn clauses. This formal approach enables precise and unambiguous specifications of policy rules and of related anomalies and, through the application of already existing verification engines, it allows rigorous verification of the absence of anomalies and the consequent guarantee that a verified policy is anomaly-free, under the assumption that the verification engine is correct.
Moreover, the proposed model is highly flexible and extensible, because it allows network operators to define their own sets of anomalies. A minimum level of correctness in the network can always be guaranteed by having a core set of pre-defined anomalies (e.g., capturing bad policy rule specifications or forwarding loops), which can then be extended by the operators.
In order to prove the usefulness of this approach in a real network scenario, we have implemented and tested a Java-based prototype of our verification model, exploiting Drools as inference engine. We have achieved verification times in the order of milliseconds for networks of reasonable size. This evaluation has been performed under different conditions created by increasing the number of forwarding rules configured in the considered network use case and the percentage of such rules that trigger an anomaly.
For the future, we plan to extend the expressiveness of the model by also considering the configurations installed into the network functions that make up the service chains. In some cases, flow forwarding may also depend on the packet processing performed by the network functions, which, in turn, depends on the function configurations.
Moreover, in order to further improve the usability of the approach, in the future we aim to extend it by exploiting techniques for the automatic refinement of anomaly rules, starting from high-level requirements, and automatic strategies for the update of policy rules when anomalies are detected.
Finally, the proposed model could become a wider and more ambitious contribution. Since policy-based systems are widespread in data protection, filtering, access control, and many other policy domains, a useful contribution can be to extend this verification model in order to encompass different policy domains. An extended verification model could verify that a domain-specific policy is consistent also in the presence of policies belonging to other domains.
which enables network operators to define forwarding policies with bandwidth constraints. All such languages offer the possibility to perform some static checks on policy descriptions, but they all miss an underlying formal model of policy and anomaly, and each one of them performs only some specific and fixed checks over a forwarding policy (e.g., Merlin checks only if a policy modification introduced by tenants includes the chains enforced in the original policy set down by the operators; FML first detects a fixed set of conflicts and then fixes them by exploiting resolution techniques; FatTire generates network configurations that are conflict-free by construction but does not provide other forms of checking).
n_i^g: the g-th network field in r_i
v_i^g: the value associated to n_i^g
c_i^k: the k-th chain in the i-th rule
f_i^{k,w}: the w-th function in c_i^k
M: traffic flow managed by the rule
C: SFCs that M can potentially traverse at run-time
P: properties associated to the flow M

We have defined a set of relational operators to compare network fields in the name-based representation and in the header-based one. Even though the value type of a network field may depend on which one of the two representations is used, the meaning of our relational operators remains unchanged. In order to perform a pairwise comparison between network fields and, in turn, to check when one or more forwarding rules match an anomaly, we need to establish inclusion relationships among network fields. The inclusion relationships involving sets of values and single values derive naturally from the set inclusion concept. For example, in the header-based representation, a range of network addresses includes single IP addresses (similarly for a port range and single port numbers). However, in the name-based abstraction, where symbolic names are used, we may need to extend the inclusion relationship by also considering the meaning of symbolic names. For example, for the trf_type field, which specifies one or more network protocols, we assume the native inclusion relation among the network protocols over the different levels of the ISO/OSI stack. In particular, a network protocol of a stack layer includes a set of protocols in the layer above and it is also included by a protocol of the underlying layer. For example, the Layer 4 TCP protocol includes many Layer 7 protocols (e.g., HTTP, FTP)

80 and port_dst = 80, then, for rule r, which includes M_H, we have port_src = port_dst. Similarly, if M_N has usr_src = Alice and usr_dst = Alice, then usr_src = usr_dst.
Since we aim at checking the presence of anomalies among forwarding rules expressed both in the name-based and in the header-based abstraction, our model needs to express anomalies with reference to both abstraction levels. To better understand how anomalies are expressed in both formalisms, let us consider another example of anomaly with reference to Figure 1. A network operator who wants to make sure all web traffic between user Alice and Google servers traverses a firewall can define a custom anomaly triggered if such web traffic may not traverse a firewall. This anomaly, which involves a single rule, is expressed in the name-based representation as
usr_src = Alice ∧ usr_dst = Google ∧ trf_type = http ∧ {<*, FW>} ⊄ c_k, ∀c_k ∈ C → webNoFirewall(r)
while in the header-based representation, if we assume Alice has IP address 130.192.225.116 and Google servers have IP 8.8.8.0/24, the same anomaly is represented as
eth_src = * ∧ eth_dst = * ∧ eth_type = 0x0800 ∧ vlan_id = * ∧ ip_src = 130.192.225.116 ∧ ip_dst = 8.8.8.0/24 ∧ ip_proto = 0x06 ∧ port_src = * ∧ port_dst = 80 ∧ {<*, FW>} ⊄ c_k, ∀c_k ∈ C → webNoFirewall(r)
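The name-based form of this anomaly can be sketched as a predicate on a single rule. The sketch assumes that the comparison on chains reads as "no function of type FW is included in the chain", that functions are (name, type) pairs with the wildcard matching any name, and that a rule is a dictionary with a `flow` and a list of `chains`; all these names are hypothetical.

```python
def web_no_firewall(rule):
    """True if Alice's web traffic to Google has no firewall in any chain."""
    flow = rule["flow"]
    matches_flow = (
        flow.get("usr_src") == "Alice"
        and flow.get("usr_dst") == "Google"
        and flow.get("trf_type") == "http"
    )
    # Universal quantification: every chain of the rule lacks an FW function.
    no_firewall = all(
        all(ftype != "FW" for _name, ftype in chain)
        for chain in rule["chains"]
    )
    return matches_flow and no_firewall


flow = {"usr_src": "Alice", "usr_dst": "Google", "trf_type": "http"}
unprotected = {"flow": flow, "chains": [[("nat1", "NAT")], [("ids1", "IDS")]]}
protected = {"flow": flow, "chains": [[("nat1", "NAT"), ("fw1", "FW")]]}
```

Replacing the outer `all` with `any` would give the existentially quantified variant, which is triggered as soon as one chain of the rule lacks a firewall.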

rule "name"
when
    // conditions (query language)
then
    // actions (java)
end
Listing 1: Structure of a rule statement in the Drools language.

1  rule "SRC Port anomaly"
2  // Rule: wrong source port definition
3  when
4    f : ForwardingRule(
5      (getPortSrc().getStartValue() < 0 [...]
     [...]Type().compareTo("srcport")==0 &&
9      getRuleId().contains(f.getId())
10   ) from as.getAnomalies( [...]
13     [...]etType("srcport");
14     a.getRuleId().add(f.getId());
15     as.getAnomalies().add(a);
16 end
Listing 2: An example of Drools rule.

Figure 2 )
has been manually set up with real data and policy rules. It contains about 35 clients, 15 servers, 10 network functions (i.e., an IDS, a VPN Gateway, an Application Firewall, a Monitor, a Packet Filter, a Web Cache, an Anti Spam and two Load Balancers), and the policy contains 23 forwarding rules. In this scenario, we performed the validation in less than 5 ms, detecting 4 anomalies: 2 Single-Field, 1 Conflicting, 1 Node ordering.
Verification time evaluated with a growing number of anomalies.

Figure 3: Evaluation times of forwarding policy verification.

Table 1: List of acronyms used in this paper.

Table 2: List of notations and symbols used in this paper.
M_H currently relies on a set N_H of OpenFlow fields n_h, but of course this set can be extended too. More precisely, N_H is currently defined as follows:
N_H = {eth_src, eth_dst, eth_type, vlan_id, ip_src, ip_dst, ip_proto, port_src, port_dst}
Similarly to the M_N model, each n_h is characterised by a type and, in order to specify a flow, it must be assigned either a single value or a set of values or all values (*). A set of values can be expressed as a range in case the type is a totally ordered set. For example, in a flow specification we can use ip_dst = 8.8.8.0/24 or port_dst = [80, 100], because the types of IP address and port number fields are totally ordered sets of values, but if we prefer we can also use single values (e.g., ip_dst = 8.8.8.151 or port_dst = 80). Hence, a flow M_H ∈ ℳ_H is formally defined as follows:

• dominance (≻): a field dominates another one if it can take all the values that can be taken by the other field. For example, if port_src_i = [1024, 2048] and port_src_j = [1024, 1500], then port_src_i ≻ port_src_j. Another example is if trf_type_i = {http, https} and trf_type_j = {http}, then trf_type_i ≻ trf_type_j. Inclusion relations over names are also taken into account. For example, let us suppose trf_type_k = {tcp} and trf_type_j = {http}, then trf_type_k ≻ trf_type_j thanks to the above-mentioned inclusion relation;
• greater than (>): a network field is greater than another one if their values have this relation (e.g., if port_src = 70001 and port_dst = 65535 then port_src > port_dst);
• correlation (∼): two fields are correlated if they share some values but neither dominates the other. Generally, this operator makes sense for fields that can take sets of values rather than single values. For example, if port_src_i = [1024, 1500] and port_src_j = [0, 1100], then port_src_i ∼ port_src_j. Similarly, if trf_type_i = {pop3, imap, snmp} and trf_type_j = {ftp, snmp}, then trf_type_i ∼ trf_type_j;
• disjointness (⊥): two network fields are disjoint if they do not share any value. For instance, if port_src_i = [1000, 1024] and port_src_j = [1100, 8080], then port_src_i ⊥ port_src_j. Similarly, if trf_type_i = {http, https} and trf_type_j = {imap, imaps}, then trf_type_i ⊥ trf_type_j.
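The field operators listed above can be sketched for the two kinds of values used in the examples: inclusive port ranges and sets of protocol names. The sketch assumes a one-level protocol hierarchy stored in the hypothetical table `INCLUDES`; the table content and the function names are illustrative and not part of the model.

```python
# Hypothetical one-level hierarchy: a protocol and the protocols it includes.
INCLUDES = {
    "tcp": {"http", "https", "ftp", "imap", "imaps", "pop3"},
    "udp": {"snmp"},
}


def classify(a_covers_b, b_covers_a, overlap):
    """Map coverage facts onto the relational operators of the model."""
    if a_covers_b and b_covers_a:
        return "equivalent"
    if a_covers_b:
        return "dominates"
    if b_covers_a:
        return "dominated"
    return "correlated" if overlap else "disjoint"


def compare_ranges(a, b):
    """Compare two inclusive ranges given as (low, high) tuples."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return classify(
        a_lo <= b_lo and b_hi <= a_hi,
        b_lo <= a_lo and a_hi <= b_hi,
        a_lo <= b_hi and b_lo <= a_hi,
    )


def expand(names):
    """Add to a set of names every protocol included by one of them."""
    expanded = set(names)
    for name in names:
        expanded |= INCLUDES.get(name, set())
    return expanded


def compare_names(a, b):
    """Compare two sets of protocol names, honouring the inclusion table."""
    a, b = expand(a), expand(b)
    return classify(a >= b, a <= b, bool(a & b))
```

Expanding symbolic names before comparing them lets the same set-based classification serve both representations, which mirrors the remark that the meaning of the operators does not depend on the value type of the field.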

Table 4: Other custom anomalies.