Symbolic Approach to the Analysis of Security Protocols

Abstract: The specification and validation of security protocols often requires viewing function calls, like encryption/decryption and the generation of fake messages, explicitly as actions within the process semantics. Following this approach, this paper introduces a symbolic framework based on value-passing processes able to handle symbolic values like fresh nonces, fresh keys, fake addresses and fake messages. The main idea in our approach is to assign to each value-passing process a formula describing the symbolic values conveyed by its semantics. In such symbolic processes, called constrained processes, the formulas are drawn from a logic based on a message algebra equipped with encryption, signature and hashing primitives. The symbolic operational semantics of a constrained process is then established through semantic rules updating formulas by adding restrictions over the symbolic values, as required for the process to evolve. We then prove that the logic required by the semantic rules is decidable. We also define a bisimulation equivalence between constrained processes; this amounts to a generalisation of the standard bisimulation equivalence between (non-symbolic) value-passing processes. Finally, we provide a complete symbolic bisimulation method for constructing the bisimulation between constrained processes.


Introduction
The sudden expansion of electronic commerce has introduced an urgent need to establish strong security policies for the design of security protocols. The formal validation of security protocols has since become one of the primary tasks in computer science. In recent years, equivalence-checking has proved to be useful for the verification of security protocols [1,4,6,19]. The main idea behind this approach to formal verification is to verify a security property by testing whether a process (specifying a protocol) is bisimilar to its intended behaviour. The success of these methods relies on two facts: 1) process algebras are suitable for the specification of such protocols, including cryptographic protocols; 2) bisimulation offers an expressive semantics to process calculi. Many other methods inspired by a wide range of approaches have been proposed in the literature to analyse security protocols, but very few offer the possibility to explicitly analyse function calls used, for example, in encrypting, decrypting, signing and hashing. In cryptography-based process calculi like Abadi & Gordon's spi-calculus [2] and Focardi & Martinelli's CryptoSPA [11], encryption and decryption manipulations are done in a parallel inference system, and therefore they are not directly observable from the process semantics. For instance, a principal sending a message m encrypted with a key k is modeled as an output action "c̄({m}_k)" (where {m}_k stands for the message m encrypted by k) whenever {m}_k can be inferred from the principal's current knowledge.
However, information flow properties (e.g. non-interference [10] and admissible interference [18]) usually require such manipulations to be observable. For that purpose, we work within the framework of an extension of value-passing CCS [16], called Security Protocols Process Algebra (SPPA) [14], in which function calls made by principals are explicitly modeled as actions. For instance, a principal sending a message m encrypted with a key k is modeled as the action "enc_id" (where id is an identifier for the principal encrypting the message) followed by the output action "c̄({m}_k)". Moreover, the specification of intruders in SPPA allows us to analyse the effects on the information flow of a protocol of an intruder generating fake messages and fake addresses. In addition, compared with a process calculus using an inference system for encryption manipulations, SPPA is more suited for analysing restricted attacks based on the repetition of the same attempt. For instance, distributed denial of service attacks have been specified in SPPA [14]. In order to deal with the notion of fake message, around which most attacks are built, we need to extend SPPA in order to specify functions generating random values. But the introduction of such generating function calls as actions requires interpreting their output as symbolic values. Thus, we need to consider symbolic value-passing processes along with a symbolic operational semantics able to handle symbolic variables without a specific value but satisfying certain constraints.
This paper introduces a symbolic framework for the specification of security protocols which is based on the novel concept of constrained process. A constrained process is a pair composed of a value-passing process (an SPPA process with, possibly, free variables standing for symbolic values) and a formula expressing a statement about symbolic values. The formula pertains to a message logic whose terms are taken from a message algebra relying on atomic sets of numbers and identifiers (addresses), and cryptographic primitives (encryption, signing and hashing operators). Therefore, the purpose of a formula within a constrained process is to bind the free variables occurring in the course of process execution. For instance, to a process generating a fresh key for a protocol run and allocating this key to some free variable x, we assign the formula which states that x stands for a key. The operational semantics of constrained processes is thus obtained from the process behaviour, subject to the restrictions imposed by its formula. Hence, a process whose definition requires the execution of an action and evolution into another process will only do so if the whole transition satisfies the formula enforced at this point. Roughly speaking, the formula within a constrained process stands for the set of messages that can be assigned to its free variables; this set of possible values evolves, along with the process, by either adding new free variables or restricting (or binding) the ones already present. In one of the main results of this paper, we prove the decidability of every formula derivable from the process algebra's operational semantics. Moreover, we feel that our symbolic framework can be applied to any other process algebra, from value-passing CCS to more expressive process algebras like Milner's π-calculus [17] and Abadi & Gordon's spi-calculus [2].
The use of value-passing processes over infinite message domains leads to infinitely branching transition graphs on which trace equivalence and bisimulation equivalence fail to be decidable. An attractive solution to this challenge was proposed by Hennessy-Lin [12], who defined a notion of symbolic bisimulation. It is primarily based on a symbolic semantics which may express value-passing CCS processes in terms of finite symbolic transition graphs instead of possibly infinite ones. The main idea behind Hennessy-Lin's approach is to assign to every action (transition) a formula describing the symbolic values (free variables) used in the action. Within this framework, they introduce two generalisations of Milner's strong bisimulation equivalence for value-passing processes called early and late bisimulation. Although our paper aims at a similar goal, we introduce a symbolic semantics in which the description of symbolic values is done within the processes (states) instead of within the transitions. In fact, our symbolic transition graphs could be directly obtained from their symbolic transition graphs (by considering every path). Moreover, our approach, compared to Hennessy-Lin's, takes advantage of an expressive message logic capable of stating cryptographic relations. For instance, we can bind free variables x_1, x_2, x_3 through the formula (x_1 == {x_2}_{x_3}) ∧ K(x_3), which states that x_1 stands for x_2 encrypted with the key x_3. In addition, we feel that the concept of constrained process is more suited for security protocol analysis than Hennessy-Lin's symbolic transition graph: a constrained process allows us to get a quick view of the symbolic values at a given state of the protocol, rather than retrieving successively every path leading to this state.
This paper is organised as follows. In section 2, we introduce a logic for cryptographic messages. In section 3, we present the SPPA process algebra and we describe a symbolic semantics for constrained processes. In section 4, we introduce a bisimulation equivalence relation for constrained processes, for which we give, in section 5, a sound and complete proof method called symbolic bisimulation. In section 6, we offer a brief overview of the application of our symbolic framework to security protocol analysis. We conclude this paper with a short discussion of related work and of our future work.

Message Algebra
We consider the following message algebra, which relies on disjoint syntactic categories of numbers, principal identifiers and variables, respectively ranging over sets N, I and V. The set T of terms is constructed from these categories using pairing and the cryptographic primitives (encryption, signing and hashing operators). It is important to note that we only consider finite terms. For any term t, we denote by fv(t) the set of variables occurring in t, and we say that t is a message whenever it contains no variable. The set of all messages is denoted by M. Furthermore, given a valuation ϱ : V → M and a term t such that fv(t) = {x_1, ..., x_n}, ϱ(t) stands for the message t[ϱ(x_1)/x_1] ... [ϱ(x_n)/x_n], i.e., the message obtained from t by substituting each variable x_i with its valuation ϱ(x_i) (i = 1, ..., n). Note that if a variable is substituted more than once in the expression t[ϱ(x_1)/x_1] ... [ϱ(x_n)/x_n], then the leftmost substitution always prevails.
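The notions of free variables, messages and valuations can be illustrated with a minimal Python sketch. The constructor names (`Num`, `Ident`, `Var`, `Pair`, `Enc`, `Hash`) and the helper `apply_valuation` are assumptions made for this illustration only, and signatures are omitted for brevity; the sketch is not the term grammar of the paper. Since a valuation maps variables to messages (which contain no variables), the sequential substitution described above coincides with the simultaneous substitution used here.

```python
from dataclasses import dataclass
from typing import Dict, Set, Union

# Hypothetical constructors: this choice of operators is an assumption of the sketch.
@dataclass(frozen=True)
class Num:            # a number n in N
    value: int

@dataclass(frozen=True)
class Ident:          # a principal identifier id in I
    name: str

@dataclass(frozen=True)
class Var:            # a variable x in V
    name: str

@dataclass(frozen=True)
class Pair:           # the pair (left, right)
    left: "Term"
    right: "Term"

@dataclass(frozen=True)
class Enc:            # {body}_key
    body: "Term"
    key: "Term"

@dataclass(frozen=True)
class Hash:           # h(body)
    body: "Term"

Term = Union[Num, Ident, Var, Pair, Enc, Hash]

def fv(t: Term) -> Set[str]:
    """Names of the variables occurring in the term t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Pair):
        return fv(t.left) | fv(t.right)
    if isinstance(t, Enc):
        return fv(t.body) | fv(t.key)
    if isinstance(t, Hash):
        return fv(t.body)
    return set()      # numbers and identifiers contain no variable

def is_message(t: Term) -> bool:
    """A term is a message whenever it contains no variable."""
    return not fv(t)

def apply_valuation(rho: Dict[str, Term], t: Term) -> Term:
    """Replace every variable of t by the message that rho assigns to it."""
    if isinstance(t, Var):
        return rho[t.name]
    if isinstance(t, Pair):
        return Pair(apply_valuation(rho, t.left), apply_valuation(rho, t.right))
    if isinstance(t, Enc):
        return Enc(apply_valuation(rho, t.body), apply_valuation(rho, t.key))
    if isinstance(t, Hash):
        return Hash(apply_valuation(rho, t.body))
    return t
```

For example, applying the valuation `{"x1": Ident("A"), "x2": Num(42)}` to `Enc(Pair(Var("x1"), Num(7)), Var("x2"))` yields a term without variables, that is, a message.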
For the sake of clarity, we will distinguish a subset K ⊆ M of messages that may be used as encryption keys. Note that the definition of the set K usually depends on the cryptosystem used by the protocol. For instance, in the case of a symmetric block-cipher algorithm, we have K = {v ∈ N | the length of v is ℓ} for some fixed ℓ ∈ ℕ; or, more generally, we may have K = N ∪ ⋃_{m≥1} {h^m(n) | n ∈ N}, where we write h^m(n) instead of h(...h(n)...) (m times). However, for simplicity, this paper simply uses the set K = N. Moreover, in order to deal with public-key encryption, we use an involutive operator [−]^{-1} : K → K (that is, (a^{-1})^{-1} = a) such that a^{-1} denotes the private decryption key corresponding to the public encryption key a, or vice versa. For symmetric encryption, let a^{-1} = a. Moreover, we assume perfect encryption and hashing.

A Logic for Messages
In the following, we consider the logic based on the terms of our message algebra and on the predicates M(·), N(·), I(·) and K(·), together with the equality t == t′ between terms. The formulas of our logic are obtained from the constants 1 and 0, these predicates and equations, using the conjunction and existential operators. The set of free variables of φ is denoted by fv(φ), and φ is said to be closed whenever fv(φ) = ∅. The satisfaction of a closed formula φ, denoted by ⊨ φ, is defined recursively as follows:
- ⊨ 1 and ⊭ 0;
- ⊨ a == b iff messages a and b are syntactically identical, i.e., ⊨ n == n for every n ∈ N, ⊨ id == id for every id ∈ I, and composite messages are equal exactly when they are built by the same operator from equal components;
- ⊨ M(a) for every message a ∈ M;
- ⊨ N(a) iff a ∈ N, and similarly ⊨ I(a) iff a ∈ I and ⊨ K(a) iff a ∈ K;
- ⊨ φ ∧ φ′ iff ⊨ φ and ⊨ φ′;
- ⊨ ∃x φ iff ⊨ φ[a/x] for some message a ∈ M.
(Notation φ[a/x] stands for the substitution of every free occurrence of variable x in φ by message a.) We assume that each predicate is decidable, i.e., the satisfiability problems ⊨ N(a), ⊨ I(a) and ⊨ K(a) are decidable for any a ∈ M, and they are never satisfied whenever a is a non-atomic message (recall that we assumed earlier that K = N). For instance, ⊭ I(h(a)) and ⊭ N({a}_b) for any a, b ∈ M. Moreover, we recall that I and N are disjoint sets.

Given a valuation ϱ : V → M, the satisfaction of a formula φ by ϱ is denoted by ϱ ⊨ φ. For instance, a formula φ with fv(φ) = {x} can state that the variable x must be a couple composed of a key and a message.

Two formulas φ and φ′ are said to be equivalent, which is denoted by φ ⇔ φ′, whenever ϱ ⊨ φ if and only if ϱ ⊨ φ′, for every valuation ϱ. In particular, a formula φ is equivalent to 0, which is denoted by φ ⇔ 0, whenever ϱ ⊭ φ for every valuation ϱ. We also consider the equivalence relation φ V⇔ φ′, which holds whenever φ ⇔ φ′ and fv(φ) = fv(φ′). The equivalence class of some formula φ under V⇔ is denoted by [[φ]].

Decidability
We can prove that every (closed) formula of our logic is decidable (Theorem 1). This result follows from the fact that our logic for messages is restricted to conjunction and existential operators. Hence, the decidability of a formula boils down to the satisfaction of a number of predicates and equations, which we assume to be decidable.
The proof of Theorem 1 is given in Appendix A.
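To make the reduction to predicates and equations concrete, the following Python sketch evaluates quantifier-free formulas under a valuation. It is an illustration under stated assumptions, not the decision procedure of Appendix A: the tuple encoding of terms and formulas, the function names `subst` and `sat`, and the representation of numbers as Python integers and identifiers as Python strings are all hypothetical, and the existential operator is deliberately left out. As in the paper, keys are taken to be numbers (K = N).

```python
def subst(rho, t):
    """Apply the valuation rho to the term t.

    Atomic messages are ints (numbers) or strs (identifiers); a variable is
    ("var", name); composite terms are tuples such as ("enc", body, key),
    ("pair", a, b) or ("hash", a).
    """
    if isinstance(t, tuple):
        if t[0] == "var":
            return rho[t[1]]
        return (t[0],) + tuple(subst(rho, s) for s in t[1:])
    return t

def sat(rho, phi):
    """Decide whether rho satisfies phi, for the quantifier-free fragment only."""
    tag = phi[0]
    if tag == "true":                      # the formula 1
        return True
    if tag == "false":                     # the formula 0
        return False
    if tag == "and":
        return sat(rho, phi[1]) and sat(rho, phi[2])
    if tag == "eq":                        # syntactic identity of messages
        return subst(rho, phi[1]) == subst(rho, phi[2])
    a = subst(rho, phi[1])
    if tag == "M":                         # every message satisfies M
        return True
    if tag in ("N", "K"):                  # K = N: only atomic numbers
        return isinstance(a, int)
    if tag == "I":                         # only atomic identifiers
        return isinstance(a, str)
    raise ValueError(f"unknown formula: {phi!r}")

# The formula (x1 == {x2}_x3) ∧ K(x3) from the introduction.
PHI = ("and",
       ("eq", ("var", "x1"), ("enc", ("var", "x2"), ("var", "x3"))),
       ("K", ("var", "x3")))
```

With this encoding, `sat({"x1": ("enc", 5, 9), "x2": 5, "x3": 9}, PHI)` holds, whereas replacing the value of `x3` by an identifier falsifies both conjuncts.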

Functions
We consider a finite set F of functions mapping messages to new messages constructed from the grammar rules above. Each function f(x_1, ..., x_n) has a characterisation formula from our logic, denoted by φ_{f(x_1,...,x_n)} or simply φ_f, which is satisfied only by messages within its domain and whose free variables are the parameters x_1, ..., x_n; its instantiation on messages a_1, ..., a_n is written φ_f(a), where a = (a_1, ..., a_n). In addition, we consider the notion of generating functions, extremely useful for the specification of security protocols requiring fresh nonces, fresh keys and random numbers, and for the specification of intruders generating fake addresses and fake messages. Generating functions are functions which may generate symbolic values without any input. Each generating function new ∈ F (often denoted by new(−)) is assigned a formula φ_new, also called characterisation formula, which is satisfied only by messages within its range. We usually consider functions such as the following:
• newId(−) with φ_newId ::= I(x);
• newKey(−) with φ_newKey ::= K(x).
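A generating function can be pictured as producing a fresh symbolic variable together with its characterisation formula. The Python sketch below is only an illustration: the class `SymbolicState`, the function `call_generator` and the encoding of a conjunction as a list of (predicate, variable) pairs are assumptions, while the two table entries mirror φ_newId ::= I(x) and φ_newKey ::= K(x) above.

```python
import itertools

# Characterisation predicates of the two generating functions named in the text.
CHARACTERISATION = {"newId": "I", "newKey": "K"}

class SymbolicState:
    """Hypothetical bookkeeping for fresh variables and their constraints."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.constraints = []          # read as a conjunction of predicates

    def fresh(self, predicate):
        """Create a variable never used before and record predicate(variable)."""
        x = f"x{next(self._counter)}"
        self.constraints.append((predicate, x))
        return x

def call_generator(state, name):
    """Symbolic reading of `let x = name(-) in ...`: x is fresh and typed by φ_name."""
    return state.fresh(CHARACTERISATION[name])
```

Calling `call_generator(state, "newKey")` and then `call_generator(state, "newId")` on a new state returns two distinct variables and leaves the constraint list `[("K", "x1"), ("I", "x2")]`, which reads K(x1) ∧ I(x2).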

Security Protocol Process Algebra
Our first step toward validation of security protocols is to find a language which may express both the protocols and the security policies we want to enforce. Process algebra has been used for some years to specify protocols as a cluster of concurrent processes, representing principals participating in the protocol, which are able to communicate in order to exchange data. Process algebras CSP [13] and CCS [16] have been extensively used with this objective [15,20]. In this section, we introduce a generic symbolic framework which we feel could be applied to numerous process algebras. Given a process algebra, we proceed by extending its syntax in order to view generating function calls "let x = new(−) in ..." as prefixes. Messages generated in this way are explicitly typed with the characterisation formula φ_new. For simplicity, our symbolic framework follows the Security Protocols Process Algebra (SPPA) [14], an extension of value-passing CCS in which local function calls are viewed as visible actions. Up to these extensions, tailored to fit the ideas presented here, SPPA is very similar to SPA presented by Focardi & Gorrieri [7]. SPPA's syntax follows Abadi & Gordon's spi-calculus [2], but without scope extrusion and replication, and the input and output prefixes do not carry channels. Also, the purpose here is not to introduce a new process algebra but just to define a generic process algebraic symbolic framework as well-suited as possible to the analysis of cryptographic protocols.

Syntax of SPPA
First, we consider a finite set C of public channels. Public channels are used to specify message exchanges between principals (commonly, there is one channel for every step of a protocol run). We assume that public channels have no specific domains: any message can be sent or received over them.
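The agents of SPPA are constructed from a grammar that includes the prefixes and operators discussed below (output and input prefixes, function calls, pair splitting, decryption, matching, sum and parallel composition), together with restriction by a set L and observation through a partial mapping O (both to be clarified in Section 3.2). Whenever f is a generating function, we usually write let x = f(−) in S.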
In order to prevent name clashes for variables (e.g. to prevent the sum or parallel composition of agents having free variables in common), we assume that a variable is never used twice to define agents (renaming variables when necessary). Given an agent S, we define its set of free variables, denoted by fv(S), as the set of variables x appearing in S which are not in the scope of an input prefix c(x), a pair splitting let (x, y) = t in, a function call let x = f(t) in, or a decryption case {t′}_t of {x}_t in; otherwise the variable x is said to be bound. Given a free variable x ∈ fv(S) and a term t, we consider the substitution operator S[t/x] where every free occurrence of x in S is set to t. A closed agent is an agent S such that fv(S) = ∅.
An SPPA principal is a couple (S, id) where S is an agent and id ∈ I. The purpose of this notation is to relate an SPPA agent S and its sub-agents to their unique owner (principal) via its identifier id. When no confusion is possible, we often use A as a reference to the principal (S_A, id_A), where S_A is the initial agent of A, i.e., the agent specifying the entire behaviour of the principal A within the protocol. Moreover, we commonly make use of the identifier id_A as a message containing its address, while we simply use A to refer to the principal's entity (i.e. the party involved with the protocol). For simplicity, given A_1 ::= (S_1, id) and A_2 ::= (S_2, id) (they must have the same identifier), we often write [t = t′]A_1 instead of ([t = t′]S_1, id), A_1 | A_2 instead of (S_1 | S_2, id), A_1 + A_2 instead of (S_1 + S_2, id), and so on.
In order to specify a security protocol in SPPA, we use the classic approach [9,20] of specifying the principals as concurrent agents. SPPA processes are then built from principals using a protocol composition operator, which is associative and commutative and forces communication over public channels.
A constrained process is an expression of the form ⟨P, φ⟩ where P is a process and φ is a formula designed to constrain the free variables occurring in P. Commonly, the notation ⟨P, φ⟩ stands for the pair (P, [[φ]]), where [[φ]] is the equivalence class of φ under the relation V⇔ (see Section 2.2). Thus, if φ V⇔ φ′ (i.e. formulas φ and φ′ are equivalent and have the same free variables), then the constrained processes ⟨P, φ⟩ and ⟨P, φ′⟩ are considered to be the same.

Example 2. Consider the following one-step protocol:

Message 1: A → B : {n_A}_{k_B}

in which principal A generates a fresh nonce n_A and sends to the principal B this nonce encrypted with B's public key k_B. Principals A and B are specified, respectively, as the SPPA principals A ::= (S_A, id_A) and B ::= (S_B, id_B), where id_A is A's identifier, id_B is B's identifier, and S_A and S_B are their initial agents. The protocol is then specified as the SPPA process P obtained by composing A and B, in which principals A and B can communicate over the public channel c. Since the process P has no free variable, the protocol is then specified as the constrained process ⟨P, 1⟩.

Symbolic Semantics
The value-passing operational semantics of an SPPA process is defined in Appendix B. Note that this value-passing semantics is only defined for closed processes. Also note that, because of the value-passing semantics, the obtained transition graph could be infinite. In this section, we establish a symbolic operational semantics for constrained processes, which corresponds to finite labeled transition graphs.
Given a term t, the actions of SPPA are the output actions c̄_id(t), the input actions c_id(x), the function call actions f_id, the actions split_id, dec_id and signv_id, the marker actions δ̄^c_id(t) and δ^c_id(t), and the silent action τ. For instance, function call action enc_{id_A} stands for principal A encrypting some message a with some key k; output action c̄_{id_A}({a}_k) stands for principal A sending message {a}_k over the public channel c; and decryption action dec_{id_A} stands for principal A successfully decrypting some message {a}_k. The silent action τ is used to express non-observable behaviours. We often use C to denote both the set of public channels and the set of output and input actions.
In value-passing process algebra, communication is commonly expressed by replacing the matching output action and input action by the silent action τ. However, this interpretation of communication causes a drastic loss of information on the content of the exchanged values and the parties involved. Using a marker action δ(t) instead of τ in those situations helps to avoid this problem. Marker actions are therefore introduced in an attempt to establish an annotation to the semantics of an SPPA process; they do not occur in the syntax of processes, and their specific semantics restricts their occurrence in order to tag communications between principals. A marker action has three parameters: a principal identifier, a channel and a term (message). Roughly speaking, the occurrence of an output marker δ̄^c_{id_A}(a) stands for "the principal A has sent message a over the channel c", and the occurrence of an input marker δ^c_{id_A}(a) stands for "the principal A has received message a over the channel c".
We write Act to denote the set of all actions, and Act_A for the set of actions that may be launched by the principal A. An observation criterion is a partial mapping O : Act* → Act which intends to express equivalence between process behaviours. Two sequences of actions γ_1 and γ_2 are said to carry out the same observation α whenever γ_1, γ_2 ∈ O^{-1}(α). Given a subset L ⊆ Act \ {τ}, we consider the observation criterion O_L, through which only behaviours from the set L are observable.
In particular, we have a natural observation criterion O_{Act_A ∪ C}, often denoted by O_A, describing the actions observable by a principal A.
The symbolic operational semantics for constrained processes is given in Fig. 1 and Fig. 2. It is inspired by Hennessy-Lin's symbolic operational semantics [12], where boolean values guarding actions are replaced by formulas φ restricting free variables within the processes. Rules Output and Input allow principals to, respectively, send and receive messages over public channels. Rules Function and Generator allow the execution of local function calls made by principals. Rule Split allows pairs to be split. Rules Decryption and Signature-Verif allow principals to, respectively, recover encrypted messages and verify signed messages. Rule Match allows the verification of equality between two messages. Rules Sum and Parallel allow the specification of non-deterministic sum and parallel product of agents (with matching identifier). Rules Protocol and Synchronisation allow the specification of protocols, where the protocol composition operator is similar to a parallel product in which communication between principals is achieved (and forced) through public channels. The operators of rules Sum, Parallel, Protocol and Synchronisation are assumed to be both associative and commutative (i.e. P + (Q + R) behaves as (P + Q) + R, Q + P behaves as P + Q, and so on). Moreover, recall that constrained processes ⟨P, φ⟩ and ⟨Q, ψ⟩ must be defined with different variables, thus fv(φ) ∩ fv(ψ) = ∅.

Rule Restriction interprets P \ L as process P with the actions in L forbidden. In the Restriction rule, we assume that the formula φ^L_α (which forbids instantiations of α to be in L) is such that fv(φ^L_α) = fv(α). Moreover, we need to restrict rule Restriction to the sets L such that the formula φ^L_α is definable within our logic. Finally, rule Observation interprets the observation of a process through an observation criterion O, where the computation ⟨P, φ⟩ --γ--> ⟨P′, φ′⟩, for a sequence of actions γ = α_0 α_1 ... α_n ∈ Act*, stands for the finite string of transitions labelled α_0, α_1, ..., α_n leading from ⟨P, φ⟩ to ⟨P′, φ′⟩. Thus, P/O_L (where L is a set of actions) means P with the actions outside L ignored (set to τ).
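The reading of P/O_L as "actions outside L set to τ" can be pictured at the level of action sequences with a short Python sketch. This is a simplified illustration, not the definition of O_L as a partial mapping on Act*; the function names `relabel` and `visible` and the encoding of actions as strings are assumptions.

```python
TAU = "tau"   # stands for the silent action τ

def relabel(trace, L):
    """Keep the actions of L and replace every other action by the silent action."""
    return [a if a in L else TAU for a in trace]

def visible(trace, L):
    """The subsequence of actions that an observer restricted to L can see."""
    return [a for a in trace if a in L]

# A principal encrypts a message, then sends it; only the send is observable here.
trace = ["enc_A", "out_c_A", "dec_B"]
L = {"out_c_A"}
```

Here `relabel(trace, L)` gives `["tau", "out_c_A", "tau"]`, and `visible(trace, L)` gives `["out_c_A"]`.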
A constrained process ⟨P′, φ′⟩ is a derivative of ⟨P, φ⟩ if there is a computation ⟨P, φ⟩ --γ--> ⟨P′, φ′⟩ for some γ ∈ Act*. The set of derivatives of ⟨P, φ⟩ is denoted by D(⟨P, φ⟩). The following theorem states that the transition graph associated to any constrained process is always finite.
Theorem 2. For every constrained process ⟨P, φ⟩, the set D(⟨P, φ⟩) is finite.

Example 3. The semantics of the constrained process ⟨A, 1⟩ (for the SPPA processes A and B of this example) is illustrated in Fig. 3. Notice that, for the transition obtained from rule Match, ⟨A_1, M(x_1) ∧ M(x_2) ∧ x_1 == x_2⟩ and ⟨A_1, x_1 == x_2⟩ correspond to the same constrained process, since M is satisfied by every message and both formulas have the same free variables. Similarly, a transition of B follows from the Input rule, where fv(B) = {y_1}. The symbolic semantics of the constrained process ⟨B, M(y_1)⟩ is given in Fig. 4.

Symbolic Semantics vs Value-Passing Semantics
The relationship between the symbolic operational semantics of constrained processes and the value-passing operational semantics of processes (see Appendix B) is detailed in the following lemmas. Every sequence of transitions between SPPA processes can be unwound to a sequence of transitions between constrained processes. Conversely, every transition between constrained processes can be interpreted as a set of transitions between processes.
For the remainder of this paper, we will distinguish between two types of actions: actions c̄_id(t), δ̄^c_id(t), δ^c_id(t), signv_id and τ (denoted by α), and actions c_id(x), f_id, split_id and dec_id (denoted by β). From the symbolic operational semantics, we see that an action β introduces a new variable x (or two, x and y, in the case of action split_id), while an action α does not. In the following, we will assume that, given an action β, x is always the introduced variable. Moreover, for simplicity, we will omit the case of action split_id, which introduces two variables, and treat it as any other action β. It is easy to see that this last assumption will not affect the results presented in this paper, since complete proofs can be obtained by adding special cases for action split_id.
Lemma 5. Let P, P′ be SPPA processes, let α′ = c̄_id(t), δ̄^c_id(t), δ^c_id(t), signv_id or τ, for some term t such that fv(t) ⊆ {x_1, ..., x_n}, and let β′ = c_id(x), f_id, split_id or dec_id. Consider a, a_1, ..., a_n ∈ M, and let α = α′[a …

The proofs of Lemma 3, Lemma 4 and Lemma 5 are given in Appendix D.

Bisimulation Equivalence for Constrained Processes
In this section, we extend Milner's notion of strong bisimulation [16] to handle constrained processes. But first, in order to bind the variables of the compared constrained processes, we need to consider finite relations between their free variables. Recall that whenever we compare two processes P and Q, we always assume that no variable has been used in both definitions.

Relations Between Variables
Consider the family 𝐑 of all relations between finite subsets of variables. For any variables x, y ∈ V and any relation R ∈ 𝐑, we consider the relation R[[(x, y)]], which is obtained from R by first removing every occurrence of x and y, and then adding (x, y).
A valuation ϱ is said to be consistent with the relation R ∈ 𝐑 if ϱ(x) = ϱ(y) whenever (x, y) ∈ R. Given x, y ∈ V, we define ϱ[x/y] as the valuation obtained from ϱ by setting ϱ[x/y](y) = ϱ(x) (and ϱ[x/y](z) = ϱ(z) otherwise). It is easy to see that the valuation ϱ[x/y] is consistent with R[[(x, y)]] whenever ϱ is consistent with R.
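The operations on relations between variables described in this section can be sketched in Python as follows. The encoding is an assumption made for illustration: a relation is a set of pairs of variable names, a valuation is a dictionary, and "removing every occurrence of x and y" is read as discarding every pair that mentions x or y. The names `update`, `consistent` and `override` are hypothetical.

```python
def update(R, x, y):
    """R[[(x, y)]]: drop every pair mentioning x or y, then add the pair (x, y)."""
    kept = {(u, v) for (u, v) in R if x not in (u, v) and y not in (u, v)}
    return frozenset(kept | {(x, y)})

def consistent(rho, R):
    """rho is consistent with R if related variables receive the same value."""
    return all(rho[u] == rho[v] for (u, v) in R)

def override(rho, x, y):
    """The valuation rho[x/y]: y takes the value of x, everything else is unchanged."""
    new_rho = dict(rho)
    new_rho[y] = rho[x]
    return new_rho

R = {("x1", "y1"), ("x2", "y2")}
rho = {"x1": 1, "y1": 1, "x2": 2, "y2": 2}
```

In this example `update(R, "x1", "y2")` contains the single pair `("x1", "y2")`; the valuation `rho` is consistent with `R` but not with the updated relation, while `override(rho, "x1", "y2")` is.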

Bisimulation
For the following definition, recall from Section 3.3 that, given an action β = c_id(x), f_id, split_id or dec_id, we assume that x is always the variable introduced by β.

Definition 6. (Bisimulation) Let ⟨P, φ⟩ and ⟨Q, ψ⟩ be constrained processes and let R ∈ 𝐑 be a full relation between fv(φ) and fv(ψ). A bisimulation between ⟨P, φ⟩ and ⟨Q, ψ⟩ with respect to R is a family of relations ℛ = {ℛ_ϱ}_ϱ, for every valuation ϱ, where each relation ℛ_ϱ ⊆ D(⟨P, φ⟩) × D(⟨Q, ψ⟩) × 𝐑 is required to satisfy matching conditions on the transitions of the related constrained processes. Two constrained processes ⟨P, φ⟩ and ⟨Q, ψ⟩ are bisimilar if there exists a bisimulation which relates ⟨P, φ⟩ and ⟨Q, ψ⟩ with respect to some full relation R between fv(φ) and fv(ψ). In that case, we write ⟨P, φ⟩ ≃ ⟨Q, ψ⟩.

Equivalence of the Bisimulations
In the following theorem, we see that the bisimulation equivalence between SPPA constrained processes, as defined above, corresponds to the strong bisimulation equivalence between SPPA processes. Obviously, this result holds only when the SPPA processes under comparison are compatible, i.e., for processes without free variables. (We say that an SPPA process P is closed whenever fv(P) = ∅.) First, we define strong bisimulation between SPPA (value-passing) processes.

Definition 7. A bisimulation between closed processes P and Q is a relation ℛ containing (P, Q) such that, whenever (P′, Q′) ∈ ℛ, every transition P′ --α--> P″ is matched by some transition Q′ --α--> Q″ with (P″, Q″) ∈ ℛ, and conversely every transition of Q′ is matched by a transition of P′ in the same way, where α ∈ Act is any (variable-free) action. We write P ≃ Q whenever P and Q are related by some bisimulation.
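For closed processes with finitely many reachable states, strong bisimilarity can be checked by the usual greatest fixed point computation: start from the relation containing all pairs of states and discard pairs that violate the transfer condition until nothing changes. The Python sketch below illustrates this standard technique on an explicit finite transition system; it is not the symbolic method of this paper, and the function name `bisimilar` and the dictionary encoding of transitions are assumptions.

```python
def bisimilar(trans, p, q):
    """Naive strong bisimilarity check on a finite labelled transition system.

    trans maps every state to a set of (action, successor) pairs; every
    successor must itself be a key of trans.
    """
    states = list(trans)
    rel = {(s, t) for s in states for t in states}   # start from all pairs

    def matched(s, t):
        # every move of s is answered by an equally labelled move of t ...
        forth = all(any(a == b and (s2, t2) in rel for (b, t2) in trans[t])
                    for (a, s2) in trans[s])
        # ... and every move of t is answered by s
        back = all(any(a == b and (s2, t2) in rel for (a, s2) in trans[s])
                   for (b, t2) in trans[t])
        return forth and back

    changed = True
    while changed:                                   # refine until stable
        changed = False
        for pair in list(rel):
            if not matched(*pair):
                rel.discard(pair)
                changed = True
    return (p, q) in rel

# A two-state loop and a one-state loop over the same action.
LOOPS = {"A": {("c", "A1")}, "A1": {("c", "A")}, "B": {("c", "B")}}

# a.(b + c) versus a.b + a.c, the classical non-bisimilar pair.
CHOICE = {"p": {("a", "p1")}, "p1": {("b", "0"), ("c", "0")},
          "q": {("a", "q1"), ("a", "q2")}, "q1": {("b", "0")},
          "q2": {("c", "0")}, "0": set()}
```

On these toy systems, `bisimilar(LOOPS, "A", "B")` holds while `bisimilar(CHOICE, "p", "q")` does not. The check terminates because the candidate relation is finite and only shrinks, which is exactly what fails for value-passing processes over infinite message domains.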
Theorem 8. Let P and Q be closed processes. Then, P ≃ Q if and only if ⟨P, 1⟩ ≃ ⟨Q, 1⟩.
Proof. First, assume that P ≃ Q and let ℛ be a bisimulation between P and Q. Consider the family of relations ℛ′ = {ℛ′_ϱ}_ϱ, where, given a valuation ϱ, the relation ℛ′_ϱ is built from the pairs (P′, Q′) ∈ ℛ, ranging over all formulas φ and ψ, with fv(φ) = {x_1, ..., x_n} and fv(ψ) = {y_1, ..., y_m}, and over the relations R between these variables. In the following, we show that ℛ′ = {ℛ′_ϱ}_ϱ is a bisimulation between ⟨P, 1⟩ and ⟨Q, 1⟩ with respect to the empty relation ∅. To achieve this goal, we show that each relation ℛ′_ϱ fits the conditions from Definition 6.
In the following, we show that ℛ′ is a bisimulation between P and Q.

Lafrance S.: Symbolic Approach to the Analysis of Security Protocols

Now assume that P′ … Consider the valuation ϱ′ defined as follows: ϱ′(x) = a and ϱ′(z) = ϱ(z) otherwise.

Symbolic Bisimulation: A Proof Method for Bisimulation
In this section, we introduce a symbolic bisimulation relation for constrained processes which can be constructed within a finite number of steps. We also show that this symbolic bisimulation relation is equivalent to the bisimulation relation introduced in Definition 6. Our symbolic bisimulation may therefore serve as a sound and complete finite proof method for the bisimulation of constrained processes.

Equivalence Relation over Valuations
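For the following, we consider two constrained processes ⟨P, φ⟩ and ⟨Q, ψ⟩. We also consider the set {φ_1, ..., φ_m} of formulas occurring in D(⟨P, φ⟩), the set {ψ_1, ..., ψ_{m′}} of formulas occurring in D(⟨Q, ψ⟩), and the sets {x_1, ..., x_n} and {y_1, ..., y_{n′}} of their respective free variables. Such sets of formulas and variables are finite (up to formula equivalence) since the sets D(⟨P, φ⟩) and D(⟨Q, ψ⟩) are finite by Theorem 2. Now consider the equivalence relation ≡ over valuations defined as follows: ϱ ≡ ϱ′ whenever ϱ and ϱ′ satisfy the same formulas among φ_1, ..., φ_m, satisfy the same formulas among ψ_1, ..., ψ_{m′}, and, for every z, z′ ∈ {x_1, ..., x_n} ∪ {y_1, ..., y_{n′}}, ϱ(z) = ϱ(z′) if and only if ϱ′(z) = ϱ′(z′). Also consider the equivalence class (with respect to ⟨P, φ⟩ and ⟨Q, ψ⟩) of a valuation ϱ, defined as [[ϱ]] = {ϱ′ | ϱ ≡ ϱ′}.

Lemma 9. Let ϱ and ϱ′ be valuations such that ϱ ≡ ϱ′.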

Lemma 10. There are finitely many equivalence classes [[ϱ]], i.e., the set {[[ϱ]] | ϱ is a valuation} is finite.
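Proof. Follows from the fact that the sets {φ′ | ⟨P′, φ′⟩ ∈ D(⟨P, φ⟩) for some P′} and {ψ′ | ⟨Q′, ψ′⟩ ∈ D(⟨Q, ψ⟩) for some Q′} are finite (up to formula equivalence). Therefore, using the notation established above, we see that there are at most 2^m equivalence classes for the relation comparing the satisfaction of the formulas φ_1, ..., φ_m; there are at most 2^{m′} equivalence classes for the relation comparing the satisfaction of the formulas ψ_1, ..., ψ_{m′}; and there are at most 2^{(n+n′)^2} equivalence classes for the relation comparing the equalities between the values of the variables. Hence, there are at most 2^m · 2^{m′} · 2^{(n+n′)^2} equivalence classes [[ϱ]].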

Symbolic Bisimulation
Definition 11. (Symbolic Bisimulation) Let ⟨P, φ⟩ and ⟨Q, ψ⟩ be constrained processes and let R ∈ 𝐑 be a full relation between fv(φ) and fv(ψ). A symbolic bisimulation between ⟨P, φ⟩ and ⟨Q, ψ⟩ with respect to R is a finite family ℛ = {ℛ_[[ϱ]]}_ϱ, for every valuation ϱ, where each relation ℛ_[[ϱ]] satisfies matching conditions stated for any action α = c̄_id(t), δ̄^c_id(t), δ^c_id(t), signv_id or τ, and any action β = c_id(x), f_id, split_id or dec_id. These conditions constrain the matching transitions through satisfaction requirements of the form ϱ ⊨ φ_2 (for actions α) and ϱ′[x/y] ⊨ φ_2 (for actions β). We write ⟨P, φ⟩ ≃_s ⟨Q, ψ⟩ whenever there exists a symbolic bisimulation which relates ⟨P, φ⟩ and ⟨Q, ψ⟩ with respect to some full relation R between fv(φ) and fv(ψ).
The following theorem states that symbolic bisimulation is a sound and complete proof method for verifying bisimilarity between constrained processes.

Theorem 12. ⟨P, φ⟩ ≃ ⟨Q, ψ⟩ if and only if ⟨P, φ⟩ ≃_s ⟨Q, ψ⟩.
Proof. First, assume that ⟨P, φ⟩ ≃ ⟨Q, ψ⟩, and let ℛ = {ℛ_ϱ}_ϱ be a bisimulation with respect to some full relation R between fv(φ) and fv(ψ). For every equivalence class [[ϱ]], consider the corresponding relation ℛ′_[[ϱ]]. Then it is enough to show that the (finite) family ℛ′ = {ℛ′_[[ϱ]]}_ϱ is a symbolic bisimulation with respect to R, therefore ⟨P, φ⟩ ≃_s ⟨Q, ψ⟩. Indeed, given an equivalence class [[ϱ]], we see that ℛ′_[[ϱ]] satisfies every condition from Definition 11.

If
There is a transition
Conversely, assume that P, φ ≃_s Q, ψ, and let R = {R_[[̺]]}_̺ be a symbolic bisimulation with respect to some full relation R between fv(φ) and fv(ψ).

Consider the family R′ = {R_̺}_̺, where R_̺ = R_[[̺]] for every valuation ̺. We see that R′ is a bisimulation with respect to R, thus P, φ ≃ Q, ψ. Indeed, given a valuation ̺, we show that the relation R_̺ = R_[[̺]] satisfies the conditions from Definition 6.

Theorem 12 allows us to construct a bisimulation between constrained processes using only finitely many valuations (one from each equivalence class [[̺]]). Therefore, it gives us a finite proof method for verifying bisimilarity between any two constrained processes.

An Example
Example 5. Consider processes A and B defined as follows: A ::= c(x_1).A′, A′ ::= c(x_2).A and B ::= c(y).B with id_A = id_B = id. The symbolic semantics of A, 1 and B, 1 are given in Fig. 5. We show that the constrained processes A, 1 and B, 1 are bisimilar. From Theorem 12, it is enough to construct a symbolic bisimulation between A, 1 and B, 1 with respect to the empty relation. First we consider the sets of formulas occurring in D(A, φ) and D(B, ψ): {1, M(x_1), M(x_1) ∧ M(x_2)} and {1, M(y)}.
Thus, we only need to consider the sets of variables {x_1, x_2} and {y}. Since these formulas are satisfied by every valuation, there are five equivalence classes with respect to ≡, namely those that cover all the possibilities of the variables having identical values (resp. different values): For each equivalence class [[̺_i]], the steps for constructing the relation R_[[̺_i]] are illustrated in Fig. 6. Thus, for each class [[̺_i]], we need to go through the - The algorithm halts (with success) when every triplet (P, φ, Q, ψ, R) has been processed, without any contradiction, for every relation R_[[̺]] such that ̺ is consistent with R. In our example, the construction of the bisimulation halts since every triplet has been added whenever it was required. Therefore, we obtain the symbolic bisimulation R =
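The count of five classes in Example 5 can be checked mechanically. The following is a minimal sketch, assuming (as the example states) that every formula involved is satisfied by every valuation, so that a class is determined only by which of the variables x_1, x_2 and y receive identical values. The function name `equality_patterns` and the string encoding of variables are illustrative choices, not notation from the paper.

```python
from itertools import product

def equality_patterns(variables):
    """Enumerate the ways a valuation can identify or separate the variables.

    Each pattern is a set of blocks; variables in one block share a value.
    """
    patterns = set()
    n = len(variables)
    # n abstract values are enough to realise every equality pattern on n variables.
    for values in product(range(n), repeat=n):
        blocks = {}
        for var, val in zip(variables, values):
            blocks.setdefault(val, []).append(var)
        # Canonical form: forget the concrete values, keep only the grouping.
        patterns.add(frozenset(frozenset(b) for b in blocks.values()))
    return patterns

classes = equality_patterns(["x1", "x2", "y"])
for blocks in sorted(classes, key=lambda p: (len(p), sorted(map(sorted, p)))):
    print(sorted(sorted(b) for b in blocks))
print(len(classes))
```

The five patterns are: all three values equal, exactly one pair equal (three cases), and all values distinct.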

Security Protocols Analysis
In this section, we give a quick overview on how to use the symbolic framework introduced in this paper to analyse security protocols.First, we show how to specify a protocol using the concept of constrained process, and, secondly, we show how to specify security properties using equivalence-checking methods.

Protocol Specification
The process algebra SPPA, along with the notion of constrained process, offers a useful framework for the specification of security protocols, including cryptographic protocols. Starting from a protocol P written in a notation à la Alice and Bob, the main idea behind our specification approach is to specify each principal involved in P as disjoint constrained processes. For instance, a principal A is specified as the constrained process A, φ_A, where A ::= (S_A, id_A), S_A is the initial SPPA agent of A, and φ_A is a formula characterising the principal's initial knowledge (specified as free variables within S_A). Note that a principal's initial knowledge (e.g. its private keys and the keys of other principals) is commonly implicitly specified within the initial agent S_A as specific messages m ∈ M or keys k ∈ K. Thus, an initial agent S_A is generally closed (i.e. fv(S_A) = ∅) and, in that case, we have φ_A ::= 1. However, our symbolic framework also allows for the specification of this initial knowledge as symbolic values (i.e. free variables). For instance, if fv(S_A) = {x} and x stands for A's private key in S_A, then we put φ_A ::= K(x).
Given specifications of the protocol's principals, let us say A, φ_A, B, φ_B and S, φ_S, the whole protocol is specified as the constrained process P, φ_P with P ::= A B S and φ_P ::= φ_A ∧ φ_B ∧ φ_S. The intruders, namely the principals attacking the security protocols, are specified similarly to the other principals. Hence, an intruder is specified as a constrained process E, φ_E, commonly called an enemy process, where E ::= (S_E, id_E), S_E is the initial SPPA agent of E (i.e. the SPPA agent specifying the intruder's attack) and φ_E is a formula characterising the intruder's initial knowledge (as above). From this notation, the protocol P being attacked by the enemy process E is then specified as the constrained process P_E, φ_PE with P_E ::= P E and φ_PE ::= φ_P ∧ φ_E.
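The way formulas are combined when principals are composed can be illustrated with a small sketch. It is not an implementation of SPPA: processes are represented by plain strings, a formula is modelled as a set of atomic conjuncts (the empty set standing for the formula 1), and the class `Constrained`, the function `compose` and the separator `" || "` are hypothetical names chosen for this illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Constrained:
    process: str        # textual stand-in for an SPPA process (assumption)
    formula: frozenset  # atomic conjuncts; the empty set models the formula 1

def compose(*parts, op=" || "):
    """Compose constrained processes; the formulas are joined by conjunction."""
    process = op.join(p.process for p in parts)
    formula = frozenset().union(*(p.formula for p in parts))
    return Constrained(process, formula)

# A holds its private key as the symbolic value x, so its formula is K(x).
A = Constrained("A", frozenset({"K(x)"}))
B = Constrained("B", frozenset())            # closed agent: formula 1
S = Constrained("S", frozenset())            # closed agent: formula 1
E = Constrained("E", frozenset({"K(z)"}))    # enemy with a symbolic key z

P = compose(A, B, S)      # the protocol, with formula K(x)
P_E = compose(P, E)       # the protocol under attack, with formula K(x) and K(z)
print(P_E.process, sorted(P_E.formula))
```

Modelling a conjunction as a set keeps the formula 1 neutral: composing with a closed principal leaves the formula unchanged.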

Equivalence-Checking
We achieve security protocol analysis through a verification method called equivalence-checking. The main idea is to verify whether the protocol always acts correctly within a hostile environment. Roughly speaking, given a protocol P, we need to verify whether the protocol being attacked, specified as the constrained process P_E, φ_PE, is equivalent to the protocol not being attacked, specified as the constrained process P, φ_P. The equivalence relation used to compare the two constrained processes is a relation based on bisimulation (Definition 6), called O-bisimulation.
The concept of O-bisimulation [19], called O-congruence by Boudol [5], captures the notion of behavioural indistinguishability through an observation criterion O. Given an observation criterion O, we say that the constrained process P, φ is O-bisimilar to the constrained process Q, ψ whenever P/O, φ ≃ Q/O, ψ. In that case, we write P ≃_O Q.
From the concept of O-bisimulation, security properties are captured through different interpretations of an information flow property called bisimulation-based non-deterministic admissible interference (BNAI) [19]. In the following, we offer a quick overview of previously defined security properties based on BNAI (see the respective references for further details).
Confidentiality [19]. Protocol P, φ_P preserves confidentiality if, for every enemy process E, where O_E = O_{Act_E} (see Section 3.2 for notation), Act_secret is the set of actions containing a secret message, and Γ is a set of downgrading actions containing every encrypting action, hashing action and signing action (hence the actions causing admissible declassification of information). This confidentiality property requires that no intruder can discriminate, in an inadmissible way, between the protocol's behaviour and the behaviour of the protocol exchanging no confidential information.
Authenticity [19]. Protocol P, φ_P preserves authenticity if, for every enemy process E, where O_auth = O_{Act_auth}, the set Act_auth ⊆ Act contains actions describing critical states of a process (i.e. the actions that should not occur when the protocol is being attacked), and Γ ⊆ Act_E is a set of admissible attacks containing the intruder's actions corresponding to harmless interference (e.g. the intruder receiving an invitation for a protocol run or initiating an honest protocol run). This authenticity property requires that no intruder can interfere in an inadmissible way with the protocol.
Denial of Service [14]. Protocol P is robust against denial of service if, for every enemy process E, where O_costly = O_{Act_costly}, Act_costly is the set of costly actions (i.e. actions requiring large amounts of resources and which could lead to resource exhaustion for some principal), and Γ ⊆ Act_E is a set of admissible attacks (defined similarly as above). This denial of service property requires no causal dependency between enemy behaviours and costly actions (hence, potentially exhausting actions) of other principals.

Future Work and Related Work
This paper presents a symbolic framework for the analysis of security protocols. It is based on a message algebra that handles cryptographic primitives and a logic over this message algebra. The notion of constrained process is then introduced as a value-passing process paired with a formula. Processes are defined through SPPA, a process algebra which allows for the specification of local function calls as visible actions. SPPA also gives, through marker actions, a clearer view of communication between principals. Generating functions for random numbers, fresh nonces and fresh keys are introduced into SPPA's syntax in order to specify intruders generating fake addresses and fake messages.
From SPPA's symbolic semantics for constrained processes, we then establish a bisimulation equivalence. Apart from introducing a new symbolic approach, the major results of this paper are the decidability of every formula in our logic (Theorem 1), the finiteness of the symbolic operational semantics of any constrained process (Theorem 2), and the fact that the bisimulation equivalence between constrained processes corresponds to Milner's strong bisimulation between value-passing processes (Theorem 8). Another main result of this paper is a sound and complete proof method, called symbolic bisimulation, to check bisimilarity between constrained processes.
The main difference between our approach and Hennessy-Lin's [12] is the symbolic transition graph: in our symbolic transition graph (symbolic semantics), we assign to each state (process) a formula giving a precise description of the free variables involved in the process; Hennessy-Lin's symbolic framework requires considering the formula built from some path leading to a given state (process). Our symbolic framework was developed with security analysis in mind; it is therefore essential to have an accurate description of the symbolic values at a given state in order to properly analyse a security protocol in a computer system. Indeed, security protocol analysis often requires checking the effect of random values (e.g. nonces, fresh keys or fake messages) on certain principals of the protocol. In this context, our notion of constrained process allows us to explicitly view which such random values could lead, at a certain point of the protocol, to either a confidentiality leak or a masquerade (authentication attack). For instance, in denial of service analysis, we commonly need to verify whether a fake message sent by an intruder can cause the execution of a function requiring a large amount of resources (e.g. decryption or signature verification). In that case, one strategy based on constrained processes would be to verify, for every process following such a costly action, the restriction imposed by the formula on the variable representing the fake message: if every fake message satisfies the formula, then we should conclude that the protocol cannot detect fake protocol runs. If only a few fake messages satisfy the formula, then we should conclude that the protocol is safe, since most fake protocol runs initiated by an intruder will have been detected previously. A similar method, based on SPPA, for detecting denial of service vulnerabilities was introduced in a previous paper [14].
Other significant symbolic methods applied to security protocols were proposed by Boreale [3] and Fiore & Abadi [8]. Starting from a process algebra similar to spi-calculus, Boreale introduces a symbolic operational semantics based on unification. Boreale then gives a method carrying out trace analysis directly on the symbolic model. Also starting from a process algebra similar to spi-calculus, Fiore & Abadi propose a decision procedure for knowledge checking and a symbolic procedure for knowledge analysis. In future work, we plan to establish more complete relationships between these methods and ours. This task would require introducing constrained processes containing π-calculus and spi-calculus processes, and establishing a symbolic semantics for such constrained processes.

A Decidability of Formulas
A formula φ is decidable whenever there is a finite algorithm allowing one to verify, for every valuation ̺, whether ̺ |= φ. But since ̺ |= φ is equivalent to |= ̺(φ), and ̺(φ) is a closed formula, we see that, in order to prove the decidability of every formula from our logic, it is enough to show that every closed formula φ is decidable, i.e. to give an algorithm deciding whether |= φ. The first step toward proving this result consists in showing that every closed formula is equivalent to a quantifier-free (closed) formula.
Therefore, given a closed formula φ, we may always assume that φ ∈ F; otherwise, an equivalent formula φ′ ∈ F can be easily constructed from φ following the steps above.
Given a formula φ ∈ F, we can construct an equivalent quantifier-free formula as follows. We proceed by induction on the number of existential quantifiers in φ. The case where φ has no quantifiers is trivial.
Let n ≥ 0 and assume that every formula in F with at most n existential quantifiers is equivalent to some quantifier-free formula in F. Now consider some formula φ ∈ F, where φ ::= ∃x ∃x_1 ... ∃x_n (φ_1 ∧ ... ∧ φ_m).
First assume that one of the φ_i is an equation (x == t) (let us say i = 1). If t = x (i.e. φ_1 = (x == x)), then we may drop φ_1 and assume that φ ::= ∃x ∃x_1 ... ∃x_n (φ_2 ∧ ... ∧ φ_m). Otherwise, if x occurs in t, then φ is equivalent to 0, hence ⊭ φ, since we do not allow infinite messages. If x does not occur in t, then we see that φ is equivalent to a formula which has one less quantifier than φ. Moreover, every sub-formula φ_i = (t == t′) can be replaced by an equivalent conjunction of equations. Since the obtained formula belongs to F, the proof is resolved using the induction hypothesis: there is a quantifier-free formula φ′ ∈ F equivalent to the formula above, and therefore equivalent to φ. Furthermore, we see that every equation can be withdrawn from φ by repeating the steps presented above for each variable x′_j.
Now assume that none of the φ_i is an equation. Moreover, assume that x occurs only in φ_1, ..., φ_k (for k ≤ m). Since the formulas ∃x I(t), ∃x N(t) and ∃x K(t) can only be true whenever t = x, none of the other quantified variables x_1, ..., x_n occurs in the predicates φ_1, ..., φ_k (otherwise φ ⇔ 0). The formula φ is therefore equivalent to ∃x (φ_1 ∧ ... ∧ φ_k) ∧ ∃x_1 ... ∃x_n (φ_{k+1} ∧ ... ∧ φ_m), where φ_i ∈ {K(x), I(x), N(x)} (for 1 ≤ i ≤ k). Hence, it is enough to find a quantifier-free formula ψ equivalent to ∃x (φ_1 ∧ ... ∧ φ_k); we take ψ ::= 1 whenever φ_i = I(x), for every i = 1, ..., k; or φ_i = N(x) or φ_i = K(x), for every i = 1, ..., k.
Otherwise we take ψ ::= 0. Finally, it is straightforward to see that the resulting formula (either 0 or ∃x_1 ... ∃x_n (φ_{k+1} ∧ ... ∧ φ_m)) still belongs to F and has at most n quantifiers. Thus, by the induction hypothesis, we can find an equivalent formula φ′ ∈ F, which is also equivalent to φ.
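The three cases for an equation x == t under a quantifier ∃x can be sketched as follows. This is only an illustration under stated assumptions: terms are encoded as nested tuples (`("var", name)`, `("const", name)`, or an operator name followed by its arguments), the names `occurs`, `substitute` and `eliminate_equation` are hypothetical, and the third case assumes that the quantifier is removed by substituting t for x in the remaining conjuncts, which is the standard step but is not spelled out in the text above.

```python
def occurs(x, t):
    """Return True if the variable named x occurs in the term t."""
    if t[0] == "var":
        return t[1] == x
    if t[0] == "const":
        return False
    return any(occurs(x, s) for s in t[1:])

def substitute(t, x, r):
    """Replace every occurrence of the variable x in t by the term r."""
    if t[0] == "var":
        return r if t[1] == x else t
    if t[0] == "const":
        return t
    return (t[0],) + tuple(substitute(s, x, r) for s in t[1:])

def eliminate_equation(x, t, rest):
    """Treat the conjunct (x == t) of a formula quantified over x.

    rest is the list of remaining conjuncts, each a tuple (name, term, ...).
    """
    if t == ("var", x):
        return ("keep", rest)        # x == x is dropped, the quantifier stays
    if occurs(x, t):
        return ("zero", [])          # no finite message solves x == t
    # Assumed step: substitute t for x, which removes the quantifier over x.
    new_rest = [(a[0],) + tuple(substitute(s, x, t) for s in a[1:]) for a in rest]
    return ("eliminated", new_rest)

x = ("var", "x")
print(eliminate_equation("x", ("pair", x, ("const", "a")), [("K", x)]))
print(eliminate_equation("x", ("const", "k"), [("K", x)]))
```

The occurs check is what encodes the remark that infinite messages are not allowed.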
For the next lemma, recall that every formula in F is closed, including the quantifier-free formulas.
Lemma 13. Every quantifier-free formula in F is decidable.
Proof. Let φ ∈ F be any quantifier-free formula. If φ = 1 or φ = 0, then the statement is trivial. Now assume that φ = φ_1 ∧ ... ∧ φ_n with n ≥ 1. Since φ is closed, every sub-formula φ_i is either a predicate I(a), K(a) or N(a) (every predicate M(a) can be replaced right away with 1), or an equation a == a′, for some messages a, a′ ∈ M. But, as we saw in Section 2.2, each predicate I(a), K(a) or N(a) is assumed to be decidable, and each equation a == a′ is also decidable by successive reductions. Hence, each sub-formula φ_i may be individually replaced by either 1 or 0. Any such conjunction of 1 and 0 is clearly decidable.

⊓ ⊔
The proof of Theorem 1 follows from Lemma 13 and the fact that every closed formula is equivalent to a formula from F.
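The decision procedure behind Lemma 13 can be sketched in a few lines. The sketch makes several assumptions that go beyond the text: closed messages are encoded as nested tuples and are taken to be already in normal form, so that an equation is decided by structural comparison rather than by successive reductions; membership in the predicates I, K and N is supplied as finite sets; and the name `decide` is illustrative.

```python
def decide(conjuncts, atoms):
    """Decide a closed, quantifier-free conjunction.

    conjuncts: list of ("I" | "K" | "N" | "M", a) or ("eq", a, b),
               where a and b are closed messages (nested tuples).
    atoms:     dict mapping "I", "K" and "N" to sets of closed messages
               (a stand-in for the decidable predicates, by assumption).
    """
    for c in conjuncts:
        if c[0] == "M":
            holds = True                 # every message satisfies M
        elif c[0] == "eq":
            holds = c[1] == c[2]         # normal forms compared structurally
        else:
            holds = c[1] in atoms[c[0]]
        if not holds:
            return False                 # one conjunct reduces to 0
    return True                          # every conjunct reduces to 1

atoms = {"I": {("const", "A")}, "K": {("const", "k")}, "N": set()}
m = ("enc", ("const", "m"), ("const", "k"))
print(decide([("K", ("const", "k")), ("eq", m, m)], atoms))
print(decide([("N", ("const", "k"))], atoms))
```

The empty conjunction is treated as the formula 1, which matches the trivial case of the proof.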

B Operational Semantics of SPPA.
The operational semantics of SPPA is given in Fig. 7 and Fig. 8. It is a value-passing-based semantics defined only for closed processes, i.e. processes P such that fv(P) = ∅. Rules Sum, Parallel, Protocol and Synchronisation are assumed to be associative and commutative.
A process P′ is a derivative of P if there is a computation P γ −→ P′ for some γ ∈ Act*. We also consider the set of P's derivatives defined as follows: D(P) = {P′ | P γ −→ P′ for some γ ∈ Act*}.

C Proof of the Finiteness of Symbolic Semantics
In this section, we show that, for any constrained process P, φ, the transition graph associated to P, φ using SPPA's operational symbolic semantics is always finite. But in order to obtain this result, we first need to establish the following restriction on SPPA's syntax (often applied to other process algebras for similar purposes): we do not allow recursive definitions P := P_1 \ L, P := P_1/O, P := P_1|P_2, or P := P_1 P_2 such that P occurs somewhere within either P_1's or P_2's definition. Hence, we assume that any recursive definition of some SPPA agent or process P (i.e. where P is defined using a self reference P) never uses a restriction operator, nor an observation operator, nor a parallel composition operator, nor a protocol operator. Such recursive definitions often lead to "infinite" processes, i.e. SPPA processes with infinitely many derivatives. For instance, processes P ::= (c(x).P) \ L and P ::= P|P′ are excluded, while processes P := let x = f(t) in P and P := c(x).P + P′ are allowed.
We say that the process P ′ is a sub-process of the constrained process P, φ whenever there is some formula φ ′ such that P ′ ,φ ′ ∈D( P, φ ).
Proof. The proof follows from the fact that SPPA's symbolic semantics rules (Fig. 1 and Fig. 2) never alter the initial definition of P (and its sub-processes), either through substitution P′[t/x] or through variable renaming. Hence, any sub-process P′ occurring in D(P, φ) must be syntactically identical to its initial definition within P. We may therefore conclude that the cardinality of the set {P′ | P′, φ′ ∈ D(P, φ) for some φ′} is at most N_u + 2N_b, where N_u is the number of unary SPPA operators (output, input, function call, match, restriction, etc.) used in the syntactical definition of P, and N_b is the number of binary operators (sum, parallel composition, etc.) used in the syntactical definition of P.

⊓ ⊔
It follows from Lemma 14 that, for any constrained process P, φ, there are only finitely many variables occurring in P and its sub-processes. Moreover, these variables are exactly the ones used within P's syntactical definition. Let {x_1, ..., x_n} be the finite set containing those variables. We may also conclude from observing the semantics rules that fv(φ′) ⊆ {x_1, ..., x_n} for any formula φ′ ∈ Φ, where Φ = {φ′ | P′, φ′ ∈ D(P, φ) for some P′}. Thus, any variable is finite. Indeed, we see from the semantics rules that the existence of a transition P, φ α −→ Q, ψ, along with the value of the action α, depends only on the process P and not on the formula φ (as long as it is not equivalent to 0). Moreover, since P, φ has only a finite number of sub-processes (by Lemma 14), the total number of actions α occurring within P, φ's semantics must be finite.
We may therefore conclude that, given any two constrained processes, there are finitely many minimal computations between them. Indeed, if N_p denotes the number of P's sub-processes and N_a denotes the number of actions occurring within P, φ's semantics, then the number of minimal computations between any two constrained processes is at most N_p! · N_p^(N_a+1), where
- N_p! is a bound on the number of possible sequences of sub-processes corresponding to some minimal computation between the two constrained processes (i.e. no sub-process may occur twice), and
- N_p^(N_a+1) is a bound on the number of possible sequences of actions corresponding to some minimal computation between the two constrained processes.
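To give a feel for the size of this bound, it can be evaluated for small values. The sketch below assumes that the bound reads N_p! · N_p^(N_a+1), which is how the flattened expression in the source is interpreted here; the function name is illustrative.

```python
from math import factorial

def minimal_computation_bound(n_p, n_a):
    """Bound on the number of minimal computations, read as N_p! * N_p**(N_a + 1)."""
    sub_process_sequences = factorial(n_p)   # no sub-process may occur twice
    action_sequences = n_p ** (n_a + 1)      # second factor of the bound
    return sub_process_sequences * action_sequences

# Three sub-processes and two actions: 3! * 3**3
print(minimal_computation_bound(3, 2))
```

The bound grows quickly, but all that the finiteness argument needs is that it is finite for every constrained process.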
In particular, there are only finitely many minimal computations between P, φ and P, φ′. ⊓ ⊔
Proof of Theorem 2. Let P, φ be a constrained process and assume that it has infinitely many derivatives, i.e. D(P, φ) is infinite. In that case, and since the number of transitions emanating from some constrained process is always bounded, there is an infinite computation with pairwise different constrained processes (i.e. no constrained process occurs more than once during the computation). Let {x_1, ..., x_n} be the set of variables occurring in the computation (1) (we saw above that this set must be finite). Since every formula ψ occurring in the computation (1) is such that fv(ψ) ⊆ {x_1, ..., x_n}, we may assume that the infinite computation has a tail such that fv(φ_1) = fv(ψ_k) = {x_1, ..., x_n}, for k ≥ 2. Moreover, since P has finitely many sub-processes (by Lemma 14), we may assume that process P_1 occurs infinitely often in this computation. Hence, we can write with γ_k ∈ Act* and fv(φ_k) = {x_1, ..., x_n}, for k ≥ 1. We may also assume that each computation P_1, φ_k γ_k −→ P_1, φ_{k+1} is minimal. Moreover, by Lemma 15, there are finitely many minimal computations between any P_1, φ_k and P_1, φ_k′, thus there are finitely many different sequences of actions γ_k. Assume that these possible sequences of actions are γ′_1, γ′_2, ..., γ′_m, hence any γ_k from the computation (2) is such that γ_k ∈ {γ′_1, γ′_2, ..., γ′_m}. Furthermore, we can assume that each γ′_k occurs infinitely often within the computation (2). Otherwise, if one of the sequences of actions γ′_k occurs only a finite number of times within the infinite computation, then we consider the infinite computation obtained by cutting the computation (2) after the last occurrence of γ′_k; this computation contains no γ′_k.
From the proof of Lemma 15, we know that the fact that a constrained process P′, φ′ may execute an action depends only on the definition of the process P′ (as long as φ′ is not equivalent to 0). Hence, given any sequence of actions γ′_k, every computation P_1, φ_l γ′_k −→ P_1, φ_{l+1} transforms the formula φ_l to the formula φ_{l+1} following the exact same rule (for any l). Moreover, we see from the symbolic semantics rules that we must have for some formulas ψ_k and ψ′_k (which do not depend on l), and where {x^(k)_1, ..., x^(k)_{n_k}} ⊆ {x_1, ..., x_n}. Also notice that the formula φ_{l+1} is equivalent to the formula where the y^(k)_i are new variables. For simplicity, we use the following notation: we write ∃ y Using this notation, we have Now consider the family of formula mappings {Γ_k}, k = 1, ..., m, with where the y^(k)_i are always new variables, i.e. any mapping Γ_k (for 1 ≤ k ≤ m) never uses the same variables y^(k)_1, ..., y^(k)_{n_k} twice. From our arguments above, we see that Γ_k(φ_l) ⇔ φ_{l+1} for any l ≥ 1 such that P_1, φ_l γ′_k −→ P_1, φ_{l+1}, and the sequence of formulas φ_1, φ_2, φ_3, ... from the computation (2) can therefore be written as φ_1, Γ_{k1}(φ_1), Γ_{k2}(Γ_{k1}(φ_1)), Γ_{k3}(Γ_{k2}(Γ_{k1}(φ_1))), ... for every l ≥ K (for some K ≥ 1 large enough). Hence, Γ_k(θ_k′) ⇔ θ_k. But this contradicts the fact that the formulas φ_l (from the computation (2)) are pairwise not equivalent, and therefore contradicts the existence of the infinite computation (2). Hence, the set D(P, φ) must be finite.

⊓ ⊔
Remark. The formulas θ_k from the previous proof are too large to be explicitly written down in this paper, but we can see that they have the following form i ) applied to each of the formulas ψ_k and ψ′_k (for k = 1, ..., m). The formula θ may be a very large formula (although it is rather small in most practical cases) which turns out to be some sort of fixed point for every mapping Γ_k. Indeed, since θ contains every composed substitution applied to every formula introduced by the mappings, any new substitution introduced by some Γ_k will have no effect on θ. The family of formulas {θ_k}, k = 1, ..., m, will therefore be such that Γ_k(θ_k′) ⇔ θ_k, for any k, k′.

D Proofs of Lemma 3, Lemma 4 and Lemma 5
Proof of Lemma 3. In order to shorten the proof, the two statements are proved simultaneously by induction on the structure of P. Depending on the statement to prove, we put either The case where P = 0 is trivial. If P = c(t).P′, or P = c(x).P′, or P = let x = f(t) in P′, or P = let (x, y) = t in P′, or P = case t of {x}_t′ in P′, or P = case t of [t′′]_t′ in P′, then the conclusion follows from rules Output, Input, Function, Generator, Split, Decryption and Signature-Verif.
If P = P_1 + P_2, or P = P_1|P_2, or P = P_1 P_2 (and α is not a marker action), then, from semantics rules Sum, Parallel and Protocol, we may assume that