Contagion Dynamics on Financial Networks

We provide a graph-theoretic background for the analysis of financial networks and review some techniques recently proposed for the extraction of financial networks. We develop new measures of network connectivity, namely Von Neumann entropies and a disagreement persistence index, using the spectrum of the normalized Laplacian and of the Diplacian. We show that the new measures account for global connectivity patterns given by paths and walks of the network. We apply the new measures to a sequence of inferred pairwise-Granger networks. In the application, we employ the proposed measures for system immunization and as early warning indicators for banking crises.


Introduction
Given the relevance of the latest financial and sovereign crises, systemic events are now deeply analysed by scholars and policy makers. As a matter of fact, studies on the consequences of systemic risk are relevant both for the stability of the financial and banking system and in terms of diversification from an investor's perspective [Das and Uppal, 2004]. As in Billio et al. [2012], we define systemic risk as "any set of circumstances that threatens the stability of or public confidence in the financial system", where interconnectedness among financial institutions and markets represents a potential channel for the propagation of shocks to the system [Billio et al., 2012, Diebold and Yilmaz, 2015]. Pairwise Granger causality tests have been used to extract the network of significant linkages among financial institutions and to find which ones are systemically important [Billio et al., 2012]. The question here is to compare the networks defined over time (using pairwise Granger causality tests), to identify possible distortions and sources of systemic risk. Recently, entropy measures have been employed in systemic risk measurement for the propagation of financial contagion [Paltalidis et al., 2015] and as early warning indicators for banking crises [Billio et al., 2016]. Billio et al. [2016] use the Shannon, Tsallis and Rényi entropies of the degree distribution of financial networks. While these measures proved effective in predicting banking crises, they take into account only the dispersion of the degree distribution. In this paper we propose an alternative entropy measure, the Von Neumann entropy, which is specifically designed for networks. The Von Neumann entropy has been widely used for the analysis of complex systems and it naturally arises from the relationship between the structure of a network and the density matrix of a state of a quantum system [for definitions and references, see Garnerone et al., 2012].
Unfortunately, most of this literature has focused on undirected networks and has used both the adjacency matrix and the Laplacian matrix to obtain the association with a state of a quantum system. In any case, the analogy with a quantum system is not in any way essential to understand the remainder of the paper. Within the few works extending the Von Neumann entropy for undirected graphs to the case of directed graphs, Ye et al. [2014] certainly represents a relevant reference. The peculiarity of directed networks, and in particular their asymmetry, requires some care and a redefinition of the Laplacian [Chung, 2005]. In particular, the combinatorial Laplacian for directed networks is formulated in terms of the nodes' out-degrees and of the Perron vector of the transition matrix, which is related to eigenvector centrality, can be soundly computed only for strongly connected components, and is related to the circulation on the graph [Chung, 2005]. In this paper, we build on Ye et al. [2014] and propose an alternative definition of the Von Neumann entropy for directed graphs based on a positivity-preserving transformation of the Laplacian matrix.
We show that the new measures can be written as weighted sums of walks of the network and relate them to consensus dynamics. This chapter is organized as follows. Section 2 introduces the notion of financial network and some background in graph theory. Section 3 reviews some methods of network extraction. Section 4 presents classical network measures and Section 5 discusses our new measures based on the notion of Von Neumann entropy, while Section 6 discusses the role of the Diplacian in consensus dynamics and designs an associated measure. Section 7 provides an empirical application.

Financial Networks
A network can be defined as a set of vertices (or nodes) and arcs (or edges) between vertices.
In financial networks, a node represents a financial institution (e.g., a bank, an insurance company, a financial conglomerate) and an edge has the interpretation of a financial linkage between two institutions. In mathematical terms, a network can be represented through the notion of graph and its properties. In the following sections we provide some background in graph theory, useful for a better comprehension of the new indicators developed in this paper and of the analysis of financial networks. For further material on graph theory and random graphs we refer the interested reader to Bollobás [1998] and Bollobás [2001]. See Jackson [2008] for an introduction to network theory in the social sciences.

Graph Theoretic Foundation
A graph is defined as the ordered pair of sets G = (V, E), where V = {1, . . . , n} is the set of vertices (or nodes) and E ⊂ V × V is the set of edges (or arcs). The order of a graph is the number of vertices in V, that is, the cardinality of V, denoted |V|. A (directed) edge between two nodes exists if there is a relationship between them, and it can be identified with the (ordered) pair {u, v} with u, v ∈ V. If there is no direction in the connection between nodes, then an edge {u, v} is an unordered pair of nodes and the graph G is said to be undirected, whereas if a direction exists, then each edge {u, v} is defined as an ordered pair of nodes and the graph G is said to be a directed graph (or digraph).
Assume for simplicity that the graph G = (V, E) is undirected. If {u, v} ∈ E then u and v are adjacent vertices and they are incident with the edge {u, v}. For each node u, it is possible to define its neighbourhood as the set of nodes adjacent to u, that is, N_u = {v ∈ V ; {u, v} ∈ E}. The vertex adjacency structure of an n-order graph G = (V, E) can be represented through an n-dimensional matrix A called the adjacency matrix. Each element a_uv of the adjacency matrix is equal to 1 if there is an edge from institution u to institution v, with u, v ∈ V and u ≠ v (since self-loops are not allowed), and 0 otherwise. If the graph is undirected then a_uv = a_vu, that is, the adjacency matrix is symmetric.
As an example, Figure 1 includes two graphs, one directed (panel (a)) and one undirected (panel (b)). In some applications it is useful to focus the analysis on a part of the graph, that is, on a subgraph. The subgraph can be induced by a subset of edges or by a subset of nodes. Panel (c) of Figure 1 shows, as an example, a subgraph of the directed graph reported in Panel (a).

Graph Connectivity
The two extreme configurations of the connectivity structure of an n-order graph G are given by the graph with empty edge set, i.e. |E| = 0, which is called the empty graph and denoted E_n, and the complete graph, where each node is adjacent to all other nodes in the graph. In the latter case, the cardinality of the edge set is maximal, i.e. |E| = n(n − 1)/2, and the graph is denoted K_n. Panels (a) and (b) of Figure 2 show an example of a complete graph, K_4, and of an empty graph, E_4.
In the connectivity structure of a graph, and in the spreading of contagion in a network, cohesiveness and indirect connections between nodes play a crucial role. Cohesiveness can be represented through the number and size of cliques or of communities.
The notion of indirect connection can be made more precise through the definitions of walk, trail, path, circuit and cycle.
A clique C ⊆ G is defined as an ordered pair of sets C = (V_C, E_C), with V_C ⊆ V and E_C ⊆ E, such that every two distinct vertices in V_C are adjacent. A walk W_uv = (v_0, e_1, v_1, . . . , e_l, v_l) between two vertices u and v of G, called endvertices, is identified by an alternating sequence of (not necessarily distinct) vertices and edges, with v_0 = u and v_l = v. The number of edges |E(W_uv)| = l in a walk is called the walk length. A walk of length l is called an l-walk and denoted W_l. It is easy to show that the number of l-walks from node u to node v is equal to the (u, v)-th element of A^l, that is,

(A^l)_uv = Σ_{w_1, . . . , w_{l−1}} a_{u w_1} a_{w_1 w_2} · · · a_{w_{l−1} v}.

If all edges are distinct then the walk is called a trail. A trail with coincident endvertices is called a circuit (or closed trail). A walk W_l with l ≥ 3, with v_0 = v_l and vertices v_j, 0 < j < l, distinct from each other and from v_0, is called a cycle and denoted C_l. An example of the cycle C_4 is given in panel (d) of Figure 2.
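As an illustration, the walk-counting property of the powers of the adjacency matrix can be checked numerically; the small 4-node digraph below is a hypothetical example:

```python
import numpy as np

# Hypothetical 4-node directed graph: edges 0->1, 1->2, 2->0, 0->2.
A = np.zeros((4, 4), dtype=int)
for u, v in [(0, 1), (1, 2), (2, 0), (0, 2)]:
    A[u, v] = 1

# The (u, v)-th entry of A^l counts the l-walks from u to v.
A2 = np.linalg.matrix_power(A, 2)
# 2-walks from 0 to 2: only 0 -> 1 -> 2 (0 -> 2 itself is a 1-walk).
print(A2[0, 2])  # 1

# Brute-force check: enumerate the intermediate vertex of each 2-walk.
n = A.shape[0]
count = sum(A[0, w] * A[w, 2] for w in range(n))
print(count == A2[0, 2])  # True
```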
A path P_uv between vertices u and v of G is a walk with distinct elements in its vertex set. A generic path of length l is denoted P_l. The shortest path P*_uv between two vertices u and v is the path with the minimum length, that is, P*_uv = argmin_l {P_uv = (v_0, e_1, . . . , e_l, v_l), l ≥ 1}. An example of the path P_2 is given in panel (c) of Figure 2. The notion of shortest path is relevant in the spreading of contagion in financial networks. The average shortest path over pairs of nodes reflects the time a shock takes to spread out in the network: the lower the average shortest path, the higher the speed of shock transmission. Moreover, if a function of the losses is assigned as a weight to the edges between nodes, then the shortest path can be used to provide a measure of the minimum loss following a transmission of shocks in the financial network.

Figure 2: Example of complete graph K_4 (a), empty graph E_4 (b), path P_2 (c) and cycle C_4 (d).
The notion of path allows us to introduce the definition of connected graph and some other basic graph structures. A graph is connected if for every pair of distinct vertices u and v there is a path from u to v. A maximal connected subgraph is a component of the graph. A cutvertex is a vertex whose deletion increases the number of components.
An edge is a bridge if its deletion increases the number of components. A graph without cycles is a forest or an acyclic graph. A tree is a connected forest.
Also, the notion of path can be used to define the distance between two nodes u and v, d(u, v), as the length of the shortest path, or geodesic, between u and v. The notion of distance allows us to define the diameter of G, diam(G) = max_{u,v∈V} d(u, v), and the radius of G, rad(G) = min_u max_v d(u, v). If the graph G is connected then there exists an integer l > 0 for which the (u, v)-th element of A^l is not equal to 0 for each pair (u, v), and the lowest integer l* such that A^{l*} is not equal to 0 for each pair of nodes is diam(G). Thus the diameter is equal to the length of the longest shortest path in G.
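The distance-based quantities above can be computed with a breadth-first search; the following sketch, on a hypothetical 5-node undirected graph, recovers the diameter and the radius:

```python
from collections import deque

# Hypothetical undirected graph: a 5-node path 0-1-2-3-4 with chord 0-2.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]
n = 5
adj = {u: set() for u in range(n)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def bfs_dist(src):
    """Geodesic distances d(src, v) via breadth-first search."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

ecc = [max(bfs_dist(u).values()) for u in range(n)]  # eccentricity of each node
diam = max(ecc)   # diameter: longest shortest path
rad = min(ecc)    # radius: smallest eccentricity
print(diam, rad)  # 3 2
```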
Finally, note that it is possible to define graph measures using cycles rather than paths.
The girth of a graph G, gir(G), is the length of the shortest cycle in a network (set to infinity if there are no cycles) and the circumference, circ(G), is the length of the largest cycle.

Network Extraction
Financial linkages among all the institutions in the system are commonly unobservable, but they can be inferred from data by applying suitable econometric tools. In the following, we review some techniques used to extract financial networks from a panel of observations which contains, for each institution i, i = 1, . . . , n, the time series of a variable of interest, y_it, t = 1, . . . , T, such as financial returns or realized volatility. Billio et al. [2012] proposed pairwise Granger causality between returns to extract the network of financial institutions. The adjacency matrix A is estimated using a Granger causality test on pairs of time series, to detect the direction and propagation of shocks between the two institutions considered. In the pairwise-Granger approach to network extraction, the following bivariate vector autoregressive model (VAR) of order one is estimated:

y_it = ϕ_11 y_{i,t−1} + ϕ_12 y_{j,t−1} + ε_it,
y_jt = ϕ_21 y_{i,t−1} + ϕ_22 y_{j,t−1} + ε_jt,

Pairwise and Conditional Granger Networks
∀ i, j = 1, . . . , n, i ≠ j, where ε_it and ε_jt are uncorrelated white noise processes. Then, a test for the existence of Granger causality is applied. The definition of causality implies, for t = 1, . . . , T:
• if ϕ_12 ≠ 0 and ϕ_21 = 0, y_jt causes y_it and a_ji = 1;
• if ϕ_12 = 0 and ϕ_21 ≠ 0, y_it causes y_jt and a_ij = 1;
• if ϕ_12 ≠ 0 and ϕ_21 ≠ 0, there is a feedback relationship between y_it and y_jt and a_ij = a_ji = 1;
where a_ij is the element in the i-th row and j-th column of the adjacency matrix A.
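A minimal sketch of the pairwise-Granger extraction, assuming an F-test on a single lag with the asymptotic 5% critical value; the simulated series, coefficient and sample size are illustrative, not those of the empirical application:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 500, 3
# Simulated returns: series 0 Granger-causes series 1; series 2 is pure noise.
y = rng.standard_normal((T, n))
for t in range(1, T):
    y[t, 1] += 0.8 * y[t - 1, 0]

def f_stat(target, cause):
    """F statistic for adding one lag of `cause` to an AR(1) of `target`."""
    yt, x1, x2 = target[1:], target[:-1], cause[:-1]
    Xr = np.column_stack([np.ones(T - 1), x1])       # restricted model
    Xu = np.column_stack([np.ones(T - 1), x1, x2])   # unrestricted model
    ssr_r = np.sum((yt - Xr @ np.linalg.lstsq(Xr, yt, rcond=None)[0]) ** 2)
    ssr_u = np.sum((yt - Xu @ np.linalg.lstsq(Xu, yt, rcond=None)[0]) ** 2)
    return (ssr_r - ssr_u) / (ssr_u / (T - 1 - 3))

A = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        if i != j and f_stat(y[:, j], y[:, i]) > 3.84:  # approx. 5% critical value
            A[i, j] = 1                                 # edge i -> j: y_i causes y_j
print(A)
```

The adjacency convention matches the bullets above: a_ij = 1 when y_it Granger-causes y_jt.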
The standard pairwise Granger causality approach deals only with bivariate time series and does not consider the conditioning on relevant covariates. In order to account for spurious causality effects, the model given above can be extended to include exogenous variables and endogenous variables at higher-order lags. The maximum lag can be selected according to some criterion, such as AIC or BIC. The conditional Granger approach considers the conditioning on relevant covariates; however, with a higher number of variables relative to the number of data points, it encounters problems of over-parameterization, which lead to a loss of degrees of freedom and to inefficiency in correctly gauging the causal relationships. Ahelegbey et al. [2016a,b] proposed an alternative approach for network estimation based on graphical models. In their approach, the following structural VAR (SVAR) model is estimated,

y_it = Σ_{j≠i} γ_ij y_jt + Σ_j ϕ_ij y_{j,t−1} + ε_it, i = 1, . . . , n,

Granger Networks and Graphical Models
where ε_it, i = 1, . . . , n, are uncorrelated white noise processes, ϕ_ij are the autoregressive coefficients of the lagged dependence structure and γ_ij the structural coefficients of the contemporaneous dependence structure. In this approach two kinds of networks are extracted: the directed acyclic graph G_0 for the contemporaneous dependence structure and the directed graph G for the lagged dependence structure. To this aim, the coefficients are re-parametrized as γ_ij = a_{0,ij} γ*_ij, where a_{0,ij} ∈ {0, 1} is an element of the adjacency matrix of G_0, and ϕ_ij = a_ij ϕ*_ij, where a_ij is an element of the adjacency matrix of G. Then the binary connectivity variables are estimated by applying a Markov chain Monte Carlo search algorithm. For the contemporaneous dependence graph G_0, an acyclicity constraint is used to identify the causal directions in the system and to produce an identifiable SVAR model.
• if γ*_21 ≠ 0 and γ*_12 = 0, y_jt is tail dependent on y_it and a_ij = 1, but not vice versa, a_ji = 0;
• if γ*_21 = 0 and γ*_12 ≠ 0, y_it is tail dependent on y_jt and a_ji = 1, but not vice versa, a_ij = 0;
• if γ*_21 = 0 and γ*_12 = 0, there is tail mutual independence between y_it and y_jt and a_ji = a_ij = 0.
The relationship between financial institutions estimated using quantile regression is asymmetric, which implies that the extracted network can be represented as a directed graph with an asymmetric adjacency matrix. The approach given above can be extended to include more lags and further covariates to control for spurious linkages.

Classical Network Measures
In this section, we present the commonly used measures in network analysis. See also Newman [2010] for a review. The structure and connectivity features of a network can be characterized by means of some measures. Node-specific measures are evaluated at the node level and reveal the role of a node in the connectivity structure of the network and its relationship with the other nodes. Local measures can be used to identify systemically important financial institutions. Global measures aim to describe the connectivity structure or topological features of the network, and therefore can be used to analyse the stability and fragility of the financial system.

Node-specific Measures (i.e., Local Measures)
In an undirected network, the degree indicates the number of adjacent nodes, that is, the number of nodes to which a node is connected. If a_uv is the u-th row, v-th column element of the adjacency matrix A, then the degree is

d_u = Σ_{v∈V} a_uv.

In directed graphs, for a given node u, it is useful to define the number of edges directed from other nodes to node u (in-degree), the number of edges directed from node u to other nodes (out-degree), and the total number of incident edges (total degree), that are

d_u^in = Σ_{v∈V} a_vu,  d_u^out = Σ_{v∈V} a_uv,  d_u^tot = d_u^in + d_u^out.

The measures d_u^in and d_u^out are also known as in-degree and out-degree centrality measures and assess the centrality of a node in the network. They can be used to identify which nodes in the network spread risk (spreaders) and which absorb it (receivers).
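The degree measures can be illustrated as follows; the 4-node digraph is hypothetical, with node 0 acting as a pure spreader and node 3 as a pure receiver:

```python
import numpy as np

# Hypothetical directed network: node 0 points to everyone, node 3 only absorbs.
A = np.array([[0, 1, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])

d_out = A.sum(axis=1)   # out-degree: edges leaving each node
d_in = A.sum(axis=0)    # in-degree: edges entering each node
d_tot = d_in + d_out    # total degree

spreaders = np.where(d_out > d_in)[0]   # net transmitters of shocks
receivers = np.where(d_in > d_out)[0]   # net absorbers of shocks
print(d_out, d_in, spreaders, receivers)
```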
While the degree measures how connected a node is, closeness accounts for connectivity patterns (such as paths and cycles) and indicates how easily a node can reach other nodes.
The closeness centrality of a node u is defined as

c(u) = [ Σ_{v≠u} l(u, v) ]^{−1},

where l(u, v) is the length of the shortest path between u and v.
A measure related to closeness is betweenness centrality, which indicates how relevant a node is in terms of connecting other nodes in the graph. Let n(u, v) be the number of shortest paths P*_uv from u to v, and n_w(u, v) = |{P*_uv ; w ∈ P*_uv}|, i.e. the number of shortest paths from u to v going through node w; then the betweenness of node w is

b(w) = Σ_{u≠w≠v} n_w(u, v) / n(u, v).

Bonacich [1987] introduced a measure of centrality for an undirected graph, called eigenvector centrality, which accounts for the centrality of the neighbourhood of a given node. This measure describes the influence of a node in a network and is defined as

x_u = (1/λ) Σ_{v∈N_u} x_v,

where the score x_u is related to the scores of its neighbourhood N_u = {v ∈ V ; a_uv = 1}.
It is easy to show that the score vector x = (x_1, . . . , x_n) satisfies the equation Ax = λx, where λ is an eigenvalue of the matrix A. Eigenvector centrality explains the propagation of economic shocks better than other measures, such as closeness and betweenness centrality, since it accounts not only for the number of connections of each node with the adjacent nodes, but also for their weights and for the weights of the paths connecting the node to the other nodes of the graph.
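A common way to obtain the eigenvector centrality scores is power iteration, which converges to the leading eigenvector of A for a connected, non-bipartite graph; the small undirected graph below is illustrative:

```python
import numpy as np

# Hypothetical undirected graph: hub 0 linked to 1, 2, 3, plus an edge 1-2.
A = np.zeros((4, 4))
for u, v in [(0, 1), (0, 2), (0, 3), (1, 2)]:
    A[u, v] = A[v, u] = 1

# Power iteration: repeated multiplication by A converges to the
# leading (Perron-Frobenius) eigenvector, i.e. the centrality scores.
x = np.ones(4)
for _ in range(200):
    x = A @ x
    x = x / np.linalg.norm(x)

print(np.round(x, 3))
# The hub 0 gets the largest score; nodes 1 and 2 (tied by symmetry)
# beat the peripheral node 3, which has the same degree-1 link to the hub.
```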
Bonacich [2007] introduced a related centrality measure, the c(β) centrality, defined as

c(β) = (I_n − βA)^{−1} A ι,

with |β| < 1/λ_1, where longer paths are weighted less through higher powers of the discount parameter β, ι is the n-dimensional unit vector and λ_1 is the largest eigenvalue of A. This measure is a weighted sum over all possible paths connecting other vertices to each position.
The c(β) centrality is strictly related to another widely used measure, the Katz centrality. Katz [1953] proposed a node centrality measure which is a weighted sum of the walks of a given node's neighbours, with weights driven by an attenuation parameter. The Katz centrality of node u is defined as

x_u = β Σ_{v∈V} a_uv x_v + α,

where 0 < β < 1 is the attenuation parameter and α is an arbitrary term that guarantees a non-null centrality score also for vertices with null degree. The parameter α is usually set to 1. The equation above can be written in matrix form as x = (I_n − βA)^{−1} α ι = (I_n + βA + β^2 A^2 + · · ·) α ι, which shows how neighbours, and neighbours of neighbours, affect a node's centrality. Finally, note that the Katz centrality has as special cases the c(β) centrality for α = 1 and Bonacich's centrality for α = 0 and β = 1/λ_1.
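The matrix form of the Katz centrality can be evaluated by solving the linear system directly rather than truncating the series; the 3-node digraph and the value of β are illustrative:

```python
import numpy as np

# Hypothetical digraph: chain 0 -> 1 -> 2 plus edge 0 -> 2 (nilpotent A,
# so the series converges for any beta).
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])

alpha, beta = 1.0, 0.3       # attenuation parameter, |beta| < 1/lambda_1
n = A.shape[0]
# Closed form of the series: x = (I - beta*A)^(-1) * alpha * iota
x = np.linalg.solve(np.eye(n) - beta * A, alpha * np.ones(n))
print(np.round(x, 3))  # [1.69 1.3  1.  ]
```

Here x can be checked by hand against the truncated series ι + βAι + β²A²ι, since A³ = 0.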

Global Measures
The density of an n-order graph is given by the ratio between the number of edges in the edge set E, denoted e(G), and the number of edges of the complete graph K_n. If K_n is undirected, the cardinality of its edge set is e(K_n) = n(n − 1)/2; if it is directed, the cardinality is e(K_n) = n(n − 1). Thus the graph density is

den(G) = e(G) / e(K_n).

The density is null if G is the empty graph E_n and it is equal to one if G is the complete graph K_n. Density is a good indicator of the level of interconnectedness in a financial network. Nevertheless, this measure relies on the adjacency of the nodes and does not consider indirect connectivity patterns such as paths and cycles.
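The density computation is immediate; the 4-node digraph below is hypothetical:

```python
import numpy as np

# Hypothetical directed graph on n = 4 nodes with 5 edges.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 0, 0]])
n = A.shape[0]
e = A.sum()                      # number of directed edges e(G)
density = e / (n * (n - 1))      # e(K_n) = n(n - 1) for a directed graph
print(density)                   # 5/12 ~= 0.417
```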
A way to account for connectivity patterns is to analyse the cliquishness of a graph. Unfortunately, the clique structure can be very sensitive to slight changes in the graph and thus, a general procedure for extracting cliques can fail to find the clique structure.
A common way to measure cliquishness is to employ the average clustering coefficient of an n-order graph G = (V, E), Cl(G), which counts the fraction of fully connected triples of nodes out of the potential triples in which at least two links are present. In formulas, we have

Cl(G) = (1/n) Σ_{u∈V} cl(u),  cl(u) = |{{v, w} ∈ E ; v, w ∈ N_u(G)}| / (d_u (d_u − 1)/2),

where N_u(G) indicates the neighbourhood of a node u in G, so that if G is the complete graph K_n, N_u(K_n) is the set of all the nodes different from u.
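The average clustering coefficient can be sketched as follows, on a hypothetical triangle-plus-pendant undirected graph:

```python
import numpy as np

# Hypothetical undirected graph: triangle 0-1-2 plus pendant edge 2-3.
A = np.zeros((4, 4), dtype=int)
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1

def local_clustering(u):
    """Fraction of pairs of u's neighbours that are themselves connected."""
    nbrs = np.flatnonzero(A[u])
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = A[np.ix_(nbrs, nbrs)].sum() / 2     # edges within the neighbourhood
    return links / (d * (d - 1) / 2)

cl = [local_clustering(u) for u in range(4)]
avg_cl = sum(cl) / len(cl)
print(cl, avg_cl)  # nodes 0 and 1 sit in a closed triangle, node 3 is pendant
```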
Assortativity can be defined as the difference between the number of edges among vertices having the same characteristics, and therefore belonging to the same class, and the expected number of edges among these vertices if the attachment were purely random [Newman, 2002, 2003]. Let m_i be the class of vertex i, and n_m the number of classes in the network. In a directed network, the number of edges among the vertices of the same class is

Σ_{u,v∈V} a_uv δ_{m_u m_v},

where δ_xy is the Kronecker delta. 2 Assuming a random graph, the expected number of edges among vertices of the same class is equal to

Σ_{u,v∈V} (d_u^out d_v^in / |E|) δ_{m_u m_v},

where d_u^out and d_v^in denote the nodes' out- and in-degrees. Thus, the assortativity measure of the graph G = (V, E) can be defined as

Q = (1/|E|) Σ_{u,v∈V} (a_uv − d_u^out d_v^in / |E|) δ_{m_u m_v}.

The maximum assortativity value Q_max is attained when all edges of E join vertices of the same category, so that

Q_max = 1 − (1/|E|^2) Σ_{u,v∈V} d_u^out d_v^in δ_{m_u m_v}.

If 0 < Q ≤ Q_max the nodes have a homophily behaviour, and if Q/Q_max is close to zero there is no preferential attachment of the nodes and the network is a random graph. If −Q_max ≤ Q < 0 the nodes exhibit disassortative patterns, in the sense that a given node is likely to be connected to nodes in a different class.
The assortativity measure is very important in systemic risk analysis because it allows one to detect clusters in the network that, in some circumstances, can act as a firewall blocking shock propagation. Assortativity can also be applied to the vertex degree, in order to capture the tendency of each node to connect with vertices having a similar or different degree. This measure is able to detect a core-periphery structure of the graph when the assortativity coefficient is high. Let e_jk be the fraction of edges connecting vertices of degree j to vertices of degree k, q_j^out and q_j^in the probabilities of having an excess out-degree and excess in-degree, 3 respectively, equal to j, and σ_q^out and σ_q^in the standard deviations of the distributions q_j^out and q_j^in, respectively. Then, following Newman [2003], the assortativity by degree for directed graphs is defined as

r = (1/(σ_q^out σ_q^in)) Σ_{j,k} j k (e_jk − q_j^out q_k^in).

Assortativity can be similarly defined for undirected graphs [see Newman, 2003].
3 As calculated in Equation 6 above.
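A sketch of the class assortativity Q for a directed graph, following the modularity-style definition above; the two-class digraph is hypothetical:

```python
import numpy as np

# Hypothetical directed graph with two classes {0, 1} and {2, 3};
# four of the five edges stay within a class (homophily).
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 1, 1, 0]])
cls = np.array([0, 0, 1, 1])           # class label m_i of each node

m = A.sum()                            # number of directed edges |E|
d_out, d_in = A.sum(1), A.sum(0)
same = (cls[:, None] == cls[None, :])  # Kronecker delta on classes

observed = (A * same).sum()                            # edges within classes
expected = (np.outer(d_out, d_in) * same).sum() / m    # random-attachment baseline
Q = (observed - expected) / m
print(round(Q, 3))  # positive value: homophily
```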

Von Neumann Entropy for Directed Graph
Following Ye et al. [2014], we introduce a Markov chain on the graph with transition matrix P with entries p_uv = a_uv / d_u^out. If D = diag{(d_1^out, . . . , d_n^out)}, then P can be written in matrix form as P = D^{−1} A. Additionally, given the vector ϕ of the ergodic probabilities of the Markov chain associated with P, 4 we define the diagonal matrix Φ = diag{ϕ} and the following Laplacians [Chung, 2005] for directed graphs:

L_1 = I_n − (P + P')/2,
L_2 = I_n − (Φ^{1/2} P Φ^{−1/2} + Φ^{−1/2} P' Φ^{1/2})/2,

where I_n is the identity matrix. L_1 can be related to a random walk on the graph starting at a node chosen with uniform probability, while L_2 to a random walk starting at a node chosen according to the ergodic probability vector ϕ. In fact, these Laplacians can be interpreted as the Laplacians of an equivalent weighted undirected graph obtained by changing the weights of the starting graph but not its connectivity [Boley et al., 2011].
The difference between the two lies in the equivalent ergodic probabilities, which are uniform for L_1 and equal to ϕ for L_2. In this sense, L_1 is largely driven by long-run effects with respect to L_2.
The L_i induced by the directed graph is symmetric but not positive definite and, consequently, it is not suitable to be used as a proper density matrix [Braunstein et al., 2006].
Using the results in Braunstein et al. [2006] and Garnerone et al. [2012], we obtain a density matrix ρ based on the Laplacian but corrected for directed graphs. A proper density matrix ρ is a symmetric positive definite matrix with unitary trace. The functional form of the density matrix describes the correspondence of the graph to a given quantum system. In Passerini and Severini [2009], for undirected graphs, the density matrix is defined as the Laplacian normalized by its trace, and Ye et al. [2014] generalize the same construction to directed graphs:

ρ(L_i) = L_i / tr(L_i), i = 1, 2.

Formally, any positivity-preserving transformation of the Laplacian could provide a proper density matrix ρ. In this regard, we consider the exponential function as an alternative transformation,

ρ(E_i) = exp(L_i) / tr(exp(L_i)), i = 1, 2.
The linear transformation takes into consideration only the 1-step walk probability, which can be viewed as the short-run effect of propagation in the network. Differently, the exponential considers all the possible walk probabilities, weighted by the inverse of the factorial of the walk length. Thus, it can be interpreted as the long-run effect of propagation. According to the random walk interpretation of the two Laplacians, we do not expect large differences between ρ(L_1) and ρ(E_1). Given the density matrix ρ, we can measure the complexity of the network using the Von Neumann entropy, which is equivalent to the Shannon entropy of the eigenvalues λ_i of ρ, and is bounded [Passerini and Severini, 2009]:

S(ρ) = −tr(ρ ln ρ) = −Σ_{i=1}^n λ_i ln λ_i.

For undirected graphs and ρ(L_i), the maximum entropy is associated with the complete graph [Passerini and Severini, 2009]. For directed graphs, the linear density matrix and the Laplacian L_1, Ye et al. [2014] show that the maximum value of the entropy is associated with the star graph, according to the quadratic approximation.
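The two density matrices and the resulting entropies can be sketched as follows. The explicit form L_1 = I_n − (P + P')/2 is our assumption for the uniform-start symmetrized Laplacian, and the 4-node strongly connected digraph is illustrative:

```python
import numpy as np

# Hypothetical strongly connected digraph: 4-cycle 0->1->2->3->0 with chord 0->2.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    A[u, v] = 1.0

D = np.diag(A.sum(axis=1))
P = np.linalg.inv(D) @ A                  # random-walk transition matrix P = D^-1 A

# Assumed form of the symmetrized Laplacian for a uniform starting distribution.
L1 = np.eye(4) - (P + P.T) / 2

def vn_entropy(rho):
    """Von Neumann entropy: Shannon entropy of the eigenvalues of rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]                # convention: 0 * ln 0 = 0
    return float(-np.sum(lam * np.log(lam)))

rho_lin = L1 / np.trace(L1)               # linear density matrix rho(L_1)
w, V = np.linalg.eigh(L1)                 # matrix exponential via eigendecomposition
expL = V @ np.diag(np.exp(w)) @ V.T
rho_exp = expL / np.trace(expL)           # exponential density matrix rho(E_1)

print(vn_entropy(rho_lin), vn_entropy(rho_exp))
```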

A Von Neumann Entropy Decomposition
We present some preliminary results which will be used to state some properties of the Von Neumann entropy proposed in this paper. Let L be one of the two Laplacian matrices L_i, i = 1, 2; then the following results hold.
Theorem 1 The following property holds for the Laplacian L, where i_1 = i and i_{q+1} = j, q = 0, 1, . . . , n − 1 is the length of the walk, and d_{i_r}^in is defined in Equation 8.
Proof. See Appendix A.
In the theorem given above, the matrices W^(q) have an interpretation in terms of paths of a random walk on a network. More specifically, for L = L_1, W^(q)_ij is the transition q steps forward in time, starting in i at time 1 and arriving in j at time (q + 1). Whereas for L = L_2, W^(q)_ij is the transition q steps forward and q steps backward in time, starting in i at time 0 and arriving in j at time 2q.
We show that the Von Neumann entropy based on the transformed Laplacian accounts for various features of the associated graph. We focus on the numerator of Eq. 24, whose m-th term is weighted by the factorial m!. It can be approximated by a truncated series with M < ∞ terms. Then, the Von Neumann entropy S given in equation 25 can be approximated accordingly.

Theorem 2 The approximated quadratic Von Neumann entropy for various degrees of approximation can be written in terms of the matrices R^(lk) and R^(k) defined in Theorem 1.

Proof. See Appendix A.
Note that our quadratic entropy for the case M = 1 has an analytical relationship with the quadratic entropy S_Q = n^{−1} tr(L_1) − n^{−2} tr(L_1^2) of Ye et al. [2014], as stated in the following.
Corollary 1 Proof. See Appendix A.
Diplacian and Convergence Rate to Distributed Consensus

In the computation of the Von Neumann entropy we considered a symmetrized version of the Laplacian matrix for directed graphs. This choice, as pointed out in two recent papers, could hamper the comprehension of several important aspects related to diffusion on the graph [Boley et al., 2011, Li and Zhang, 2012]. In particular, we investigate the relationship between the eigenvalues of the Diplacian, introduced in Li and Zhang [2012] and defined as Γ = Φ^{1/2}(I − P)Φ^{−1/2}, and the rate of convergence of autonomous agents on the network to a consensus [Olfati-Saber et al., 2007]. The application of these techniques to financial networks can be understood as measuring the structural speed of coordination, so that a low rate implies persistence of disagreement in the market which, in line with Carlin et al. [2014], is "magnified when major events occur in financial markets". Our approach considers a limited communication network [see Parikh and Krasucki, 1990] among agents, proxied by the causality relationships between their stocks' returns. The convergence to a final group decision is then a convergence to the consensus of investors, trading in different stocks, on common Arrow-Debreu securities prices. Consequently, we propose the persistence of disagreement as a general proxy for the presence of market frictions. Consider a graph with adjacency matrix A, with elements a_ij, and out-degree diagonal matrix D, with non-zero elements d_i^out. We investigate the following discrete-time multi-agent dynamical system on the network:

x_i(t + 1) = (1/d_i^out) Σ_j a_ij x_j(t), i = 1, . . . , n.

Similar systems, pioneered by DeGroot [1974], are considered in models of belief evolution of boundedly rational agents, with the bounded rationality motivated by a persuasion bias [DeMarzo et al., 2003, Golub and Jackson, 2010]. The closest analogue to our approach is the one considered in Theorem 2 of Olfati-Saber et al. [2007]. The system can be rewritten in vector form as x(t + 1) = P x(t) and, following Olfati-Saber et al. [2007], the system converges to a consensus with group decision value ϕ'x(0).
The group decision is a conserved quantity of the dynamics:

ϕ'x(t + 1) = ϕ'P x(t) = ϕ'x(t).

Consequently, we can define the disagreement vector δ(t) = x(t) − (ϕ'x(0)) ι and its dynamics

δ(t + 1) = P δ(t).

The disagreement dynamics allows us to study the speed of convergence to the group decision value.
We exploit the theoretical results on lazy random walks on strongly connected directed graphs due to Chung [2005] and Li and Zhang [2012]. In particular, Li and Zhang [2012] introduce the decomposition of the Diplacian Γ into its symmetric and skew-symmetric parts,

Γ = L + ∆,  L = (Γ + Γ')/2,  ∆ = (Γ − Γ')/2.

In the following theorem the convergence rate is expressed in terms of λ_2, the second smallest eigenvalue of L, of the second largest singular value σ_{n−1}(I_n − L) of I_n − L, and of the largest singular value σ_n(∆) of the skew-symmetric part ∆ of the Diplacian.
Theorem 3 Consider the discrete-time system introduced in (34) on a strongly connected directed network. A consensus is globally exponentially reached at a rate bounded by μ, where μ is the disagreement persistence index, measuring the convergence rate to consensus.

Proof. See Appendix A.
Li and Zhang [2012] show that for symmetric adjacency matrices, i.e. for undirected graphs, σ_n(∆) = 0, and they propose it as a measure of asymmetry (directedness). Moreover, they underline that in this case μ = 1 − λ_2/2, a bound previously derived in Chung [2005].

The expression in Theorem 3 implies a slower convergence if the graph is directed and shows an initial magnifying effect of the heterogeneity of the importance of the nodes in the group decision. The latter effect is not present for balanced directed graphs, i.e. for directed graphs that have equal row and column sums, because in that case the group decision is the average of the initial state vector. Going further, if we assume an initial disagreement vector of unitary norm, we can evaluate the time needed to reach consensus. In the empirical section we evaluate μ on the giant strongly connected component and multiply it by a weight proportional to the size of the component, in order to take into consideration the impact of the number of coordinated agents.
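The disagreement persistence index can be sketched numerically. Since the exact bound of Theorem 3 involves all the terms of the decomposition, the sketch below computes only the undirected-case bound μ = 1 − λ_2/2 from the symmetric part of the Diplacian, and verifies the consensus dynamics by simulation; the digraph is hypothetical:

```python
import numpy as np

# Hypothetical strongly connected digraph: 4-cycle 0->1->2->3->0 with chord 0->2.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    A[u, v] = 1.0

P = np.diag(1.0 / A.sum(axis=1)) @ A       # consensus dynamics x(t+1) = P x(t)

# Ergodic probabilities: left Perron eigenvector, phi' P = phi'.
w, V = np.linalg.eig(P.T)
phi = np.real(V[:, np.argmax(np.real(w))])
phi = phi / phi.sum()

Phi = np.diag(phi)
Gamma = np.sqrt(Phi) @ (np.eye(4) - P) @ np.linalg.inv(np.sqrt(Phi))  # Diplacian
L = (Gamma + Gamma.T) / 2                  # symmetric part
Delta = (Gamma - Gamma.T) / 2              # skew-symmetric part

lam2 = np.sort(np.linalg.eigvalsh(L))[1]   # second smallest eigenvalue of L
mu = 1 - lam2 / 2                          # undirected-case bound of Chung [2005]
print(round(mu, 4))

# Simulate: the group decision phi'x is conserved and disagreement vanishes.
x = np.array([1.0, -1.0, 2.0, 0.5])
g0 = phi @ x
for _ in range(200):
    x = P @ x
print(np.allclose(x, g0 * np.ones(4)), np.isclose(phi @ x, g0))
```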

Data Description
The dataset is composed of the daily closing price series of European financial institutions (active and dead) from 29th December 1999 to 16th January 2013. We analyse a total of 437 European financial institutions, selected according to the Industrial Classification Benchmark (ICB). We select the MSCI Europe index as the proxy for the market index, which covers the 15 European countries where the financial institutions are based.
To estimate dynamic Granger networks, we use a rolling window approach [e.g., see Billio et al., 2012, Diebold and Yılmaz, 2014, Zivot and Wang, 2003] with a window size of 252 daily observations, that is, approximately one year. The sequence of adjacency matrices of the directed graphs extracted with the pairwise Granger approach is represented in Fig. 3 (a weekly sampling frequency has been used for expository purposes). Red boxes highlight the adjacency matrix at a given date; in each box, blue dots represent directed edges between nodes.
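The extraction step on a single window can be sketched as follows (a minimal sketch: the lag-1 specification, the F-test form and the 5% level are our simplifying assumptions, and the function names are ours):

```python
import numpy as np
from scipy import stats

def granger_causes(x, y, alpha=0.05):
    """Lag-1 pairwise Granger test: does x help predict y?  Compares a
    restricted AR(1) for y with a model adding the lag of x (F-test)."""
    yt = y[1:]
    X_r = np.column_stack([np.ones(len(yt)), y[:-1]])   # restricted model
    X_u = np.column_stack([X_r, x[:-1]])                # unrestricted model
    rss = lambda X: np.sum((yt - X @ np.linalg.lstsq(X, yt, rcond=None)[0]) ** 2)
    dof = len(yt) - X_u.shape[1]
    F = (rss(X_r) - rss(X_u)) / (rss(X_u) / dof)
    return stats.f.sf(F, 1, dof) < alpha                # reject -> edge x -> y

def granger_adjacency(R):
    """Adjacency matrix of the Granger network on one window.
    R: T x n matrix of returns (one column per institution)."""
    n = R.shape[1]
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and granger_causes(R[:, i], R[:, j]):
                A[i, j] = 1.0
    return A
```

In the paper's setting, the rolling sequence of networks would be obtained by calling `granger_adjacency` on each 252-day slice of the return panel.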
The question here is to compare the network defined over time, to identify possible distortions and sources of systemic risk. Table 1 reports, for the four networks depicted in Figure 5, the diameter, radius (Ra), average path length (APL), number of weakly and strongly connected components (WCC and SCC), average clustering coefficient (ACluCoe), average closeness centrality (ACloCen), average betweenness centrality (ABetCen), average eigenvector centrality (AEigCen), the Von Neumann entropies for the linear, i.e. S(ρ_L1) and S(ρ_L2), and exponential, i.e. S(ρ_E1) and S(ρ_E2), functional forms using the two Laplacians L_1 and L_2, and the disagreement persistence index (µ). In Figure 5, for expository purposes, nodes with a degree lower than 70 (a), 90 (b), 300 (c) and 350 (d) have been removed; the difference in thresholds is due to the different network densities. The node color from black to red reflects the node eigenvector centrality: red indicates a large centrality level, black a low one.
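The standard summary measures in Table 1 can be obtained with off-the-shelf graph routines; a sketch on a toy directed graph (the graph and the use of networkx are our illustrative assumptions, not the paper's pipeline):

```python
import networkx as nx

# Toy directed network (illustrative): a directed 4-cycle
G = nx.DiGraph([(0, 1), (1, 2), (2, 3), (3, 0)])

De = nx.density(G)                                    # network density
Di = nx.diameter(G)                                   # needs strong connectivity
Ra = nx.radius(G)
APL = nx.average_shortest_path_length(G)
WCC = nx.number_weakly_connected_components(G)
SCC = nx.number_strongly_connected_components(G)
ACluCoe = nx.average_clustering(G)                    # directed clustering
ACloCen = sum(nx.closeness_centrality(G).values()) / G.number_of_nodes()
ABetCen = sum(nx.betweenness_centrality(G).values()) / G.number_of_nodes()
AEigCen = sum(nx.eigenvector_centrality(G).values()) / G.number_of_nodes()
```

Diameter, radius and APL require a strongly connected digraph; on the empirical networks these would be computed on the giant strongly connected component.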

The diameter is smaller in (c) and (d) than in (a) and (b). The decrease in the diameter indicates that a financial shock propagates faster in (c) and (d) than in (a) and (b). This is supported also by the average path length (APL), which is larger in (a) and (b) than in (c) and (d) (see Table 1). The same applies to the average clustering coefficient (ACluCoe), average closeness centrality (ACloCen) and average betweenness centrality (ABetCen). Finally, the increase of the (log) Von Neumann entropies and of the disagreement persistence index (µ) in (c) and (d) with respect to (a) and (b) indicates a more complex topology of the network. The index then decreases, with some short spikes before and immediately after the ECB's asset purchase programme.
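The entropy computation itself is compact; a minimal sketch (the symmetric normalized Laplacian and β = 1 in the exponential form are our simplifying assumptions — the paper's L_1 and L_2 may be defined differently):

```python
import numpy as np

def von_neumann_entropy(A, form="linear"):
    """Von Neumann entropy S(rho) = -Tr(rho log rho) of an undirected graph.
    'linear': rho = L / Tr(L);  'exp': rho = exp(-L) / Tr(exp(-L)),
    with L the symmetric normalized Laplacian."""
    d = A.sum(axis=1)
    D_ih = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - D_ih @ A @ D_ih
    lam = np.linalg.eigvalsh(L)          # rho shares the eigenvectors of L
    if form == "linear":
        p = lam / lam.sum()
    else:
        p = np.exp(-lam) / np.exp(-lam).sum()
    p = p[p > 1e-12]                     # drop numerical zeros (0 log 0 = 0)
    return float(-(p * np.log(p)).sum())

# Complete graph K4: normalized-Laplacian spectrum {0, 4/3, 4/3, 4/3},
# so the linear-form entropy equals log 3
A = np.ones((4, 4)) - np.eye(4)
S_lin = von_neumann_entropy(A, "linear")
S_exp = von_neumann_entropy(A, "exp")
```

Because the entropy is a function of the whole Laplacian spectrum, it responds to global path and walk structure rather than only to the degree distribution, which is the point made in the introduction.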

Early Warning Indicators
As in Billio et al. [2016], we observe persistence in the dynamics of entropy, and we thus test the ability of the Von Neumann entropy and of the disagreement persistence index to act as early warning indicators in nowcasting banking crises. An early warning system issues a signal when the likelihood of a crisis crosses a specified threshold [Demirgüc-Kunt and Detragiache, 1999]. In this regard, we select as the European banking crisis indicator the one presented in Babeckỳ et al. [2014] and Alessi and Detken [2014a], which represents one of the target variables monitored by the European Systemic Risk Board (ESRB). The indicator identifies significant signs of financial distress in the banking system, as evidenced by bank runs in relevant institutions or losses in the banking system (nonperforming loans above 20% or bank closures of at least 20% of banking system assets), or by significant public intervention in response to, or to avoid the realisation of, losses in the banking system. As stated in Billio et al. [2016], since the crisis indicator is given on a per-country basis, we define an indicator on a European basis to be used in the early warning system: it equals 1 if more than one country is in crisis at time t, and 0 otherwise.
The banking crisis indicator in Alessi and Detken [2014a] has its last record in December 2012, and thus we consider the period from October 2000 to December 2012.
Since the crisis indicator is at a quarterly frequency and networks are extracted from daily returns, we assume that the daily crisis indicator is equal to 1 for all days in a given quarter if the quarterly indicator equals 1 for that quarter [see Billio et al., 2016]. Consequently, we make use of a logistic model with the Von Neumann entropies and the disagreement persistence index as covariates. As a comparison, we include the DCI [Billio et al., 2012], that is, the density of the network (De) defined in Equation 14, and the Shannon entropy (H) of the degree distribution, as proposed in Billio et al. [2016].
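The construction of the daily European crisis indicator can be sketched as follows (the per-country flags are toy data, and 63 trading days per quarter is a rough approximation):

```python
import numpy as np

# Hypothetical per-country, per-quarter crisis flags (rows: countries)
country_crisis = np.array([[0, 1, 1, 0],
                           [0, 1, 0, 0],
                           [0, 0, 1, 0]])

# European indicator: 1 if more than one country is in crisis in the quarter
C_quarterly = (country_crisis.sum(axis=0) > 1).astype(int)

# Expand to daily frequency: every day inherits its quarter's flag
days_per_quarter = 63                    # approximate trading days per quarter
C_daily = np.repeat(C_quarterly, days_per_quarter)
```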
Hence, we set the following logistic regression model: Pr(C_t = 1 | S_t) = Φ(β_0 + β_1 S_t), where Φ is the logistic function and S_t denotes the connectivity measure used as covariate.
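A self-contained sketch of the fit (the Newton-Raphson optimizer and the synthetic data are our choices; any standard logit routine would serve equally well):

```python
import numpy as np

def fit_logit(S, C, iters=25):
    """Maximum likelihood for Pr(C = 1) = 1 / (1 + exp(-(b0 + b1*S)))."""
    X = np.column_stack([np.ones_like(S), S])
    b = np.zeros(2)
    for _ in range(iters):                     # Newton-Raphson steps
        p = 1.0 / (1.0 + np.exp(-X @ b))
        W = p * (1.0 - p)
        H = X.T @ (W[:, None] * X)             # observed information matrix
        b = b + np.linalg.solve(H, X.T @ (C - p))
    return b

# Synthetic check: generate crises from a known model and recover the slope
rng = np.random.default_rng(0)
S = rng.standard_normal(1000)
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * S)))
C = (rng.random(1000) < p_true).astype(float)
b0_hat, b1_hat = fit_logit(S, C)
```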

EWI evaluation and loss function
In the previous section we compared the models by applying some statistical goodness-of-fit measures. In practice, an economic comparison is also needed, especially when policymakers are interested in using early warning signals to detect vulnerabilities in the financial system according to the indicators of interest (i.e. banking crises). For detecting crisis events using information from indicator C_t, we use Φ̂_t = Φ(β̂_0 + β̂_1 S_t), the predicted probability of crisis returned by the logit model. The predicted probability is then turned into a binary prediction, which takes the value of 1 if Φ̂_t exceeds a specified threshold c, and 0 otherwise.

Table 2: Logit estimates using: 1) dynamic causality index (DCI) as proposed in Billio et al. [2012]; 2) Shannon entropy on In-Out degree (H) as in Billio et al. [2016]; 3) S(ρ_L1), the Von Neumann entropy with the linear functional form and Laplacian L_1; 4) S(ρ_L2), the Von Neumann entropy with the linear functional form and Laplacian L_2; 5) S(ρ_E1), the Von Neumann entropy with the exponential functional form and Laplacian L_1; 6) S(ρ_E2), the Von Neumann entropy with the exponential functional form and Laplacian L_2; and 7) µ, the disagreement persistence index. Significance level: 1% (***). Standard errors in parentheses. The number of observations is 3187.
We set the threshold c = 0.50. The key aspect of these models is the forecast evaluation, and therefore the quality of the issued signals. In fact, policymakers aim to distinguish between two types of potential errors: the missed crisis (type 1 error) and the false signal of crisis (type 2 error). Therefore, the pairs of values of the actual and predicted crisis (C_t, Ĉ_t) can form four possible combinations: both equal to 1, both equal to 0, or different. We can represent those values in a contingency matrix [see Alessi and Detken, 2014b, Duca and Peltonen, 2013, Holopainen and Sarlin, 2016, Sarlin, 2013], which describes this relationship as reported in Figure 8.
Figure 8: contingency matrix of actual and predicted crises.

                      Crisis                        Not crisis
Signal issued         Correct signal (TP)           False alarm (FP)
No signal issued      Missed crisis (FN)            Correct no signal (TN)

The type 1 error (T_1) represents the share of missed crises on the total crises, FN/(TP + FN), while the type 2 error (T_2) represents the share of issued false alarms on the total tranquil periods, FP/(TN + FP).
The percentage of correctly predicted indicators is given by the correct signals (TP) and correct no-signals (TN) over the total realizations. We evaluate the impact of the two types of errors using a loss function. In this regard, we adopt the one proposed by Alessi and Detken [2014b], L(θ) = θ T_1 + (1 − θ) T_2, which is simple to implement and robust to small perturbations; θ is the relative risk aversion parameter of the decision maker between type 1 and type 2 errors. If θ > 0.5, the aversion to missing a crisis (false negative) is greater than that to a false alarm (false positive). As suggested by Alessi and Detken [2014b], we set θ equal to 0.5, since preferences above this level are uncommon in the financial stability community. If the binary predictor performs perfectly, the value of the loss function is 0; conversely, the worse it performs, the closer the loss is to 1. Other robust methods, such as the ROC curve, can be used to evaluate the TP and TN rates by varying the threshold [e.g., see Drehmann and Juselius, 2014].

Table 3: Percentage of correctly predicted banking crises with the logit models using: a) dynamic causality index (DCI) as proposed in Billio et al. [2012]; b) Shannon entropy on In-Out degree as in Billio et al. [2016]; c) S(ρ_L1), the Von Neumann entropy with the linear functional form and Laplacian L_1; d) S(ρ_L2), the Von Neumann entropy with the linear functional form and Laplacian L_2; e) S(ρ_E1), the Von Neumann entropy with the exponential functional form and Laplacian L_1; f) S(ρ_E2), the Von Neumann entropy with the exponential functional form and Laplacian L_2; and g) µ, the disagreement persistence index. The banking crisis indicator is defined as more than one country being in crisis. The data are from Alessi and Detken [2014a].
T_1 and T_2 represent the share of missed crises on the total crises (FN/(TP + FN)) and the share of issued false alarms on the total tranquil periods (FP/(TN + FP)), respectively. The percentage of correctly predicted indicators is given by the correct signals (TP) and no-signals (TN) over the total realizations. L(θ) is the loss function and θ represents the relative risk aversion parameter of the decision maker between type 1 and type 2 errors.
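The evaluation step can be coded directly (the function and variable names, and the toy signal series, are ours):

```python
import numpy as np

def ewi_evaluation(C, p_hat, c=0.5, theta=0.5):
    """Type 1 / type 2 errors and the preference-weighted loss
    L(theta) = theta*T1 + (1-theta)*T2 for a signal threshold c."""
    C_hat = (p_hat > c).astype(int)             # binary signal
    TP = int(np.sum((C_hat == 1) & (C == 1)))   # correct signal
    FN = int(np.sum((C_hat == 0) & (C == 1)))   # missed crisis
    FP = int(np.sum((C_hat == 1) & (C == 0)))   # false alarm
    TN = int(np.sum((C_hat == 0) & (C == 0)))   # correct no-signal
    T1 = FN / (TP + FN)                         # share of missed crises
    T2 = FP / (TN + FP)                         # share of false alarms
    loss = theta * T1 + (1 - theta) * T2
    return loss, T1, T2

C = np.array([1, 1, 1, 1, 0, 0, 0, 0])
p_hat = np.array([0.9, 0.8, 0.2, 0.6, 0.1, 0.7, 0.3, 0.2])
loss, T1, T2 = ewi_evaluation(C, p_hat)
```

Varying `theta` reproduces the sensitivity analysis discussed below Table 3; varying `c` instead traces out the ROC curve.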
The results in Table 3 confirm the goodness of the Von Neumann entropies and of the disagreement persistence index with respect to the DCI and the Shannon entropy on the In-Out degree. In particular, the Von Neumann entropy with the exponential functional form and Laplacian L_2, S(ρ_E2), shows the lowest value of the loss function (0.19), with the highest true positive (TP) and the lowest false negative (FN) values. Moreover, the Von Neumann entropy with the exponential functional form and the Laplacian L_2 shows the lowest type 2 errors (T_2), while the exponential one with Laplacian L_1 shows the lowest type 1 errors (T_1). Since S(ρ_E2) has a higher T_1 but a lower T_2 with respect to S(ρ_E1), if we set a lower value for θ, the loss function penalizes T_2 errors more, and consequently S(ρ_E1). By selecting θ = 0.4, both Von Neumann entropies with the exponential functional form perform equally well, while for values below or equal to 0.3 the entropy with the Laplacian L_1 performs better.

Conclusion
In this paper, we provide a graph theoretic background for the analysis of financial networks and review some techniques recently proposed for the extraction of financial networks.
We develop new measures of network connectivity based on the notion of Von Neumann entropy and show that they account for global connectivity patterns given by paths and walks of the network. We apply the new measures to a sequence of inferred pairwise-Granger networks. In the application, we show how to use the proposed measures to achieve effective immunization of the financial system from the spread of contagion. Finally, we show that entropy measures can be successfully employed to generate early warning signals for banking crises.
To show the results in Theorem 3, we note that the disagreement vector satisfies, at any time, the constraint imposed in the maximization of the quotient in Lemma 1.
Let V(t) = ξ_t' Φ ξ_t, with Φ = diag(φ), be a candidate Lyapunov function for the system (34). It is a valid candidate because the Perron vector φ is strictly positive, so that V(t) = 0 if and only if ξ_t is the zero vector. A bound on V(t + 1) is easily obtained using Lemma 1: since µ < 1, the system is asymptotically stable. Accordingly, we have min(φ) ξ_t'ξ_t ≤ ξ_t' Φ ξ_t ≤ µ^t ξ_0' Φ ξ_0 ≤ max(φ) µ^t ξ_0'ξ_0, from which the result follows.