Topological approximate Bayesian computation for parameter inference of an angiogenesis model


	3 Materials and methods



	3.1 Topological data analysis

We illustrate the TDA pipeline starting from input data, homology, interpretation and visualization through to topological statistics inFigure 1.
To characterize the k-dimensional features of a topological space X we can consider the homology group in dimension k, Hk(X), composed of elements that intuitively correspond to equivalence classes of cycles that can be continuously deformed into one another on X. In dimension one, the generators of the homology group correspond to 1D holes in X, or loops, while in dimension zero the generators of the homology group correspond to the connected components of X.
The topological spaces we are interested in can be represented using finite sets of simplices known as simplicial complexes K that are constructed by joining together individual simplices, potentially of different dimensions, and are closed under the operation of taking faces. A 0D simplex corresponds to a single vertex, a 1D simplex an edge, and a 2D simplex a triangle. Given a real valued function on K, we can define a filtration as a sequence of homology groups in a given dimension k, with homomorphisms induced by inclusion
(1)0=Hk(Ka0)→Hk(Ka1)→…→Hk(Kan)=Hk(K)where Ka=f−1(−∞,a] and a0<a1<…<an, and Kai⊆Kaj for i < j. Persistent homology then tracks the birth and death of elements of the homology groups as a varies. By choosing an appropriate definition of the simplicial complex and filtration built from the data, persistent homology can provide information about the topological features in data.
We build the simplicial complex and filtration from the final timepoint of model simulation data followingNardini et al. (2021). All cells in the 2D square lattice that have vasculature present are assigned a value of one, and zero elsewhere. The centroid of each non-zero cell is a 0-simplex. The simplicial complex is built on these 0-simplices based on so-called Moore neighbourhoods: if any of the eight cells surrounding a vertex are also non-zero, then we connect them via 1-simplices (edges) for two points pairwise connected, or 2-simplices for three points pairwise connected by an edge. The union of these simplices form a simplicial complex. There are different ways to study vascular data at multiple scales using filtrations (Bendich et al., 2016;Stolz et al., 2020). Here, we construct sequences of filtered simplicial complexes using a sweeping plane filtration (Bendich et al., 2016;Nardini et al., 2021). In the sweeping plane filtration, we move a vertical line from left to right across the 2D lattice domain and include simplices in the filtration only to the left of this line. This filtration can be considered a sublevel set filtration corresponding to a height function h:X→R on this simplicial complex.


	3.2 Approximate Bayesian computation

In Bayesian inference, we aim to derive the posterior distribution of the parameters of a model given some observed data. To do so we first define a prior distribution on the model parameters, treating them as random variables. This describes our belief in the distribution of the parameters before having observed any data. We then perform a so-called Bayesian update of the model having observed some data. This is done using the likelihood of the observed data given the model and parameters. From this, we arrive at a posterior distribution that describes the conditional distribution of the parameters given the observed data. If we denote the model parameters by θ, and the data by x, we can first write the prior as p(θ), and the likelihood of the data as p(x|θ). In the Bayesian framework, we apply Bayes rule to update the prior distribution having observed the data, giving us the posterior distribution as
(2)p(θ|x)=p(x|θ)p(θ)p(x),where p(x) is known as the evidence or marginal likelihood, and plays a key role in Bayesian model selection. Evaluation of the marginal likelihood is often computationally expensive or intractable. However, in many settings (e.g. when sampling from the posterior using Markov chain Monte Carlo techniques), it is sufficient to be able to write down the posterior up to proportionality
(3)p(θ|x)∝p(x|θ)p(θ).
This approach relies on the ability to calculate both the prior of the parameters p(θ), which is generally tractable, and the likelihood p(x|θ). However in many models of interest it is not tractable or not possible to directly evaluate p(x|θ), for example in population genetics (Beaumont et al., 2002), random graph models (Thorne and Stumpf, 2012) and some models of dynamical systems (Liepe et al., 2014;Toni et al., 2009). To allow us to perform Bayesian inference in these situations, an approach named ABC was developed, based on initial work inFu and Li (1997) andTavaré et al. (1997), developed further inBeaumont et al. (2002) andMarjoram et al. (2003), and expanded in many works, see for exampleSisson et al. (2007);Toni et al. (2009);Beaumont et al. (2009);Del Moral et al. (2012);Prangle et al. (2018).
In an ABC framework, we rely on the observation that given the ability to sample realizations y from p(x|θ), we can rewrite the posterior as
(4)p(θ|x)=∫p(θ,y|x)dy,where
(5)p(θ,y|x)=1(x=y)p(y|θ)p(θ)p(x),and by relaxing this to
(6)p(θ,y|x)≈1(D(x,y)<ϵ)p(y|θ)p(θ)p(x),we can generate samples from an approximate posterior (which we shall refer to as the ABC posterior) by using a suitably small ϵ in Algorithm 1. Often when applying the rejection algorithm, we fix the number of samples S and select ϵ such that the set of samples θs^ with ds<ϵ is some fraction αS.Algorithm 1 ABC rejection sampler algorithm1: fors∈1,…,Sdo2:  Sample θs^∼p(θ)3:  Simulate y∼p(y|θs^)4:  Calculate ds←D(g(y),g(x))5: end for6: Return samples θs^ where ds<ϵThe ABC rejection sampler algorithm requires us to define a distance on the data, D(x, y), and in some cases this may itself be intractable. It is then possible to substitute a summary statistic of the data, g(x) in place of the data itself, leading to a distance on these summary statistics D(g(x),g(y)) being considered. In the case where g is a sufficient statistic for the model, as ϵ→0 this will be equivalent to applying a distance on the x and y themselves. Often this is not the case, and this is another avenue through which ABC produces an approximation to the posterior rather than a true evaluation of the posterior itself.


	3.3 Topological statistics for approximate Bayesian computation

In previous work,Nardini et al. (2021) applied topological statistics of simulated data (2D binary images) to quantify different regimes in the parameter space of the Anderson–Chaplain model of angiogenesis. By constructing simplicial complexes from the output data of a spatial model, and using the same filtration asNardini et al. (2021), PH can be applied to describe the presence of topological features in the simulated data.
In some cases when calculating the persistence of the topological features of a filtration, it is possible for some features to persist indefinitely, so that their death in the filtration is represented as +∞. In our application, this causes information about certain topological features to be lost, for example loops and some connected components, as although we know when they are born in the filtration, we have no measure of their extent. For this reason,Nardini et al. (2021) computed persistence of a left to right sweeping plane filtration and right to left sweeping plane filtration of the simplicial complex built from the simulated model data [seeNardini et al. (2021) for details]. By viewing the left to right filtration as a sublevel set filtration and the right to left filtration as a superlevel set filtration, more information (e.g. only finite bars that capture the extent of topological features) can be extracted as a consequence of duality and symmetry theorems (Cohen-Steiner et al., 2009).


	3.4 Extended persistence

Here, we propose a more elegant solution that applies the extended persistence ofCohen-Steiner et al. (2009), which forces all topological features to be of finite length. Extended persistence was developed to study cavities and protrusions in protein docking (Agarwal et al., 2006;Cohen-Steiner et al., 2009). Since then,Yim and Leygonie (2021) optimized spectral wavelets for graph classification using extended persistence, and extended a differentiability result for ordinary persistence to extended persistence.
In standard persistence, the sublevel sets Xa=f−1(−∞,a] of the manifold X are nested and PH is defined through the corresponding linear sequence of homology groups. In extended persistence, we compute the homology of the sublevel sets, as well as the relative homology with respect to the superlevel sets Xa=f−1[a,∞). For a set of values a0,…,an that bound and fit between the critical points of f, the extended persistence in dimension k is defined as the persistence of the homology groups and relative homology groups as
(7)0=Hk(Xa0)→Hk(Xa1)→…→Hk(Xan)=Hk(X)Hk(X)=Hk(X,Xan)→…→Hk(X,Xa0)=0where Hk(X,Xa) denotes the relative homology group of X and X [a] in dimension k (Edelsbrunner and Harer, 2010).
This extended persistence can be broken down into multiple components (Cohen-Steiner et al., 2009), the ordinary part, formed of topological features that are both born and die within the homology groups of the sublevel sets of X, the relative part of features that are born and die in the relative homology groups, and the extended part of features that are born in the ordinary homology groups and die in the relative homology groups in the filtration. The birth time b of a feature may be larger than its death time d due to the possibility that the feature dies in the relative homology group H(X,Xd) with d < b. The extended part can be further divided into topological features that have b < d, termed extended+, and those with d < b, termed extended–.


	3.5 Persistence images

The output of applying PH to a dataset is often represented as a persistence diagram, that for a given dimension k consists of a plot of points (b, d), where b is the time of birth and d is the time of death d of each dimension k topological feature in the filtration. To allow for the straightforward application of methods from machine learning to these diagrams,Adams et al. (2017) developed the concept of a persistence image. This allows a persistence diagram to be represented as a vector in Rn, so that for example it can be used in methods such as K-means clustering, as inNardini et al. (2021).
To generate the persistence image corresponding to a persistence diagram represented as a multiset of points ( b, d), the points are first transformed to give a multiset B of birth and persistence coordinates (b,d−b) (for extended persistence, we require a slightly different formulation—see below). We note that the persistent image formulation ofAdams et al. (2017) ignores all infinite persistent features. A persistence surface in R2→R is then defined as the weighted sum of kernels applied to each birth/persistence coordinate
(8)f(x,y)=∑(b,p)∈Bg(b,p)h(x,y;b,p),where g(b, p) is the weight of the feature and h is a suitable kernel. From the persistence surface defined inEquation (8), an m × m array of values is created by discretizing f(x, y) into an m by m grid in a suitable range. This array can then by flattened to give a vector in Rm2. As inAdams et al. (2017), we apply a Gaussian kernel for h with mean μ=(b,p) and fixed standard deviation σ.
We remark that extended persistence only has finite persistence; therefore, no information (i.e. the infinite bars in ordinary persistence) is lost in the persistence images for extended persistent homology.


	3.6 TABC

We use a set of topological statistics derived from the extended persistence of a filtration over the simplicial complex representing the data as the summary statistics in an ABC framework, in a method we title TABC, to perform topological posterior inference on the Anderson–Chaplain model of angiogenesis. In the TABC methodology, the summary statistics used in ABC are the persistence images in each dimension produced by the by the four components of the extended persistence of a filtration. To allow persistence images to be generated for the extended persistence, in components of the extended persistence with points in the persistence diagram ( b, d) with d < b, we flip the coordinates to consider instead (d, b), which when transformed into a birth/persistence coordinate then represents the duration of persistence of the feature in the relative part, or the gap between birth in the ordinary homology and death in the relative homology of the feature in the extended–part. We generate persistence images of dimension 50 by 50 with a constant weight function for the persistence surface and the kernel of the persistence images set as a multivariate Gaussian distribution with standard deviation σ = 1, as we found this to work well. As the distance metric in the ABC algorithm, we applied the Euclidean distance between the statistics. In our implementation we use the GUDHI library ( http://gudhi.gforge.inria.fr/) to construct simplicial complexes, generate extended persistence diagrams and produce persistence images (with standard weighting g = 1).


	3.7 Image-based statistics

For comparison, we also consider four statistics based on the binary image data produced by the simulations, that were chosen with the aim of differentiating the different classes of behaviours observed inNardini et al. (2021), without overlapping with features that could be considered as topological descriptors (e.g. numbers of connected components). These statistics are:

 **  Mean X coordinate: The mean X value of occupied pixels.
**  Mean Y coordinate: The mean Y value of occupied pixels.
**  Maximum X coordinate: The maximum X value of an occupied pixel.
**  Mass: The fraction of occupied pixels.
As with the topological statistics, we applied the Euclidean distance between vectors of statistics as the distance in the ABC rejection algorithm.
