Stochastic and deterministic models for the metastatic emission process: formalisms and crosslinks

Although the detection of metastases radically changes prognosis of and treatment decisions for a cancer patient, clinically undetectable micrometastases hamper a consistent classification into localised or metastatic disease. This chapter discusses mathematical modelling efforts that could help to estimate the metastatic risk in such a situation. We focus on two approaches: 1) a stochastic framework describing metastatic emission events at random times, formalised via Poisson processes, and 2) a deterministic framework describing the micrometastatic state through a sizestructured density function in a partial differential equation model. Three aspects are addressed in this chapter. First, a motivation for the Poisson process framework is presented and modelling hypotheses and mechanisms are introduced. Second, we extend the Poisson model to account for secondary metastatic emission. Third, we highlight an inherent crosslink between the stochastic and deterministic frameworks and discuss its implications. For increased accessibility the chapter is split into an informal presentation of the results using a minimum of mathematical formalism and a rigorous mathematical treatment for more theoretically interested readers.


Introduction
Metastasis is the spread of cancer cells to distant tissues, broadly divided into two steps, physical dissemination and tissue-specific colonisation (Chaffer and Weinberg, 2011). While the first part is facilitated by a reversible phenotypic change of cancer cells (Yu et al, 2013), successful colonisation involves complex tumour-microenvironment interactions and is still not well understood (Sahai, 2007;Nguyen et al, 2009).
Being responsible for most cancer-related deaths, metastasis is a pivotal point in disease history (WHO, 2015). However, since metastases smaller than approximately 10 7 cells remain undetectable by medical imaging or other diagnostic tools, the clinical appearance of nonmetastatic disease may not reflect the true metastatic state of a patient. Therefore, estimating the metastatic risk in cancer patients without visible metastases is of major clinical importance (Pantel et al, 1999). In this respect, mathematical and statistical techniques have the potential to derive risk scores from clinical data.
Today, there is a large body of mathematical oncology literature; a recent review by Scott et al (2013b) specifically focuses on the metastatic process. Here, we will briefly summarise modelling efforts focusing on the metastatic risk.
The emergence of a metastatic phenotype is governed by a number of key mutations (Gupta and Massagué, 2006;Hanahan and Weinberg, 2011). Michor et al (2006) and Haeno and Michor (2010) translated this assumption into mathematical models, thereby deriving a metastatic risk score from evolutionary principles. In an opinion paper, Anderson and Quaranta (2008) qualified approaches explaining emergent behaviour through lower-level mechanisms as "the essence of integrated mathematical oncology". While acknowledging the importance of such approaches for improving our understanding of cancer biology, we will here focus on more data-driven models, featuring simpler principles and thereby better matching the limited amount of information available in a clinical situation.
In an early work, Koscielny et al (1984) established a link between primary tumour size at surgery and risk of recurrence from a large cohort of breast cancer patients. Later, Michaelson et al (2002) explained these and other data heuristically (see also Section 2.1). In addition to such phenotypic characteristics, specific genetic signatures of the primary tumour have been found to be associated with increased metastatic risk (van de Vijver et al, 2002). These risk prediction models are static; they do not aim at representing the time evolution of the disease.
In contrast, dynamic models allow to predict the modelled system at different times, and more easily integrate data obtained at different observation times. To represent the dependency of the metastatic process on the primary tumour, a dynamic description of primary tumour growth is also integrated into metastatic models. The simplest dynamic model for tumour growth is the exponential model, which adequately describes growth under no restrictions (e.g. in vitro). In many cases of interest however, especially in vivo, sigmoidal (s-shaped) models with an initial exponential phase and subsequent deceleration are better suited to describe growth dynamics. The Gompertz model is a classical example that has been commonly used (Wheldon, 1988;Norton, 1988;Hahnfeldt et al, 1999). Power growth models have also been used for the description of clinical (Hart et al, 1998) and preclinical tumour growth data (Benzekry et al, 2014b). Although much more complicated models have been developed, e.g. describing the spacial evolution of a tumour, here we restrict our discussion of primary tumour growth to the simple models introduced above. A stochastic dynamic model for metastasis was proposed by Bartoszyński et al (2001). Their approach described the emission times of metastases as random events, formalised through a so-called non-homogeneous Poisson process with an emission rate increasing with primary tumour size (more details in Section 2.2). A variant of this model successfully described data on bone lesions from a metastatic breast cancer case (Hanin et al, 2006). The Poisson law allows for an interpretation of the emission process as being "memoryless" and many of its properties can be analysed mathematically.
Dynamic models for metastasis can also be used to infer parts of the process that cannot be observed experimentally or clinically. In this respect, Iwata et al (2000) proposed a partial differential equation (PDE) model to describe the size distribution of metastatic colonies (the approach is explained further in Section 3.1). Thereby, they characterised the micrometastatic state of a hepatocellular carcinoma patient from clinical information on visible metastatic colonies. Later, this size-structured model was also successfully incorporated into mixed-effects models to predict the size evolution of metastases in animal models without (Hartung et al, 2014) and with (Benzekry et al, 2015) surgical removal of the primary tumour. A unique feature of this approach is that it allows to integrate secondary metastatic emission into the model. Indirect evidence on the capacity of metastases to spread further was given both through cancer network models (Chen et al, 2009;Newton et al, 2012Newton et al, , 2013) and through the discovery of self-seeding mechanisms (Comen et al, 2011;Scott et al, 2013a).

Outline
In this chapter, we focus on the two dynamical frameworks for metastasis described above, the Poisson process and the size-structured model. Three aspects are covered: • an accessible introduction to the Poisson process framework with an example motivating the approach, • an extension of the Poisson model to account for secondary metastatic emission, • the inherent link between the (extended) Poisson model and the size-structured model.
Finally, we exploit the crosslink between the two frameworks in order to evaluate the adequacy of the modelling assumptions in the deterministic model and to realise simulations using both frameworks together. We restrict our discussion to models describing the natural history of metastatic progression. While surgery of the primary tumour can be represented in these models, to incorporate the effect of systemic treatments a more general formalism would be required. For the ease of presentation, we will not include this layer of complexity here, although we point out that both frameworks have been extended to cover much more general cases, including systemic treatment (Hanin and Zaider, 2011;Verga, 2010) and more complex interactions between primary tumour and metastases (Benzekry et al, 2014a). For increased accessibility for readers with a non-mathematical background, the present work is split into two parts. First, the concepts behind these approaches are presented informally (Sections 2 and 3), breaking down the mathematical formalism as much as possible. Then, Section 4 contains rigorous definitions of all mathematical objects; detailed proofs can be found in Appendix A.

Metastatic risk
Predicting the probability of metastatic disease at diagnosis of the primary tumour is of major clinical importance since it is strongly linked to survival expectancy. One possibility to build such a prediction model is by using large databases to correlate information on the presence of metastases to primary tumour characteristics at diagnosis or surgery. As an example, we show a relationship established by Michaelson et al (2002) between primary tumour size at surgery and probability of metastasis, based on clinical data on breast cancer: where d is the largest diameter of the primary tumour at surgery, and c, z are parameters, which were determined from the cohort data. Such a relationship can be given by a mechanistic interpretation on the following premises: Power growth. The growth of the largest primary tumour diameter d follows a power law, i.e. it is the solution of the ordinary differential equation In this equation, a PG determines growth speed and α allows to describe different growth shapes: exponential growth (α = 1), linear growth (α = 0), and a spectrum of sigmoidal growth patterns in between (0 < α < 1).
Power law of emission. During each (infinitesimal) time interval, there is a chance that the primary tumour emits a metastasis. The emission intensity λ depends on the current size of the primary tumour through a power law: λ(t) = b d(t) β . In this context, the parameter b can be interpreted as the metastatic aggressiveness of the emitting tumour. Iwata et al (2000) linked β to the mode of vascularisation of the primary tumour: a uniform vascularisation would correspond to β = 3 (dimension of space) and a surficial vascularisation to β = 2 (dimension of a surface). 1 Memorylessness. The probability of emission of a metastasis is independent of the previous emission history.
A typical model allowing a mathematical formalisation of the above premises is the Poisson process. In principle, randomness of the metastatic emission process could also be represented through different probability laws, thereby dropping the memorylessness property (as done e.g. by Bethge et al, 2015). However, the Poisson model has several advantages: it does not require any additional statistical parameters, it has a high degree of analytical tractability (i.e., many of its properties can be investigated through mathematical analysis, not only by simulations), and there are efficient numerical routines to simulate the process (e.g. by thinning, see Lewis and Shedler, 1979). The detailed derivation of the empirical relationship is presented in Section 4.2.

Poisson processes
A Poisson process (PP) is a model for counting a series of events occurring at random times. The precise definition of this process is given in Section 4, but its basic properties are the two following ones (see Fig. 1 for an illustration): 1. The number of events in disjoint time intervals are independent. This translates the memorylessness property since given some time t, the number of future events (those happening at any time t future > t) do not depend on the past events (those happening at any time t past < t), but only depend on the present state of the system at time t. For example, in Fig. 1), the time elapsed between t and T (4) 2. The number of events N t that occurred by time t has a Poisson distribution with parameter meaning that for the probability of having observed exactly k events until time t is given by These two properties characterise the PP, and can even serve as a definition in addition to N 0 = 0. From these two properties, one can show that the probability that the next event time lies between times t and t + ∆t is approximately λ(t)∆t. Hence, λ determines the event frequency, and this is the reason why it is called the intensity function.
In the setting of this chapter, we are interested in describing the inception times of new metastatic lesions via PPs. This means that N t is the number of metastases emitted until time t in our context. Following Bartoszyński et al (2001) and Hanin et al (2006), we will first suppose that only the primary tumour has the capacity of seeding metastases.
A constant emission intensity λ (called a homogeneous PP) would mean that a tumour consisting of a few cells is equally likely to shed a metastasis as a large tumour of several grams. Since such a model is not realistic, we need to consider time-varying intensities λ (called non-homogeneous PPs). We will consider an emission intensity λ that depends on some measure of primary tumour size X p (t) (diameter, volume, number of cells,...). The relationship between primary tumour size and emission intensity is given by a size-dependent emission law γ, i.e. λ(t) = γ(X p (t)).
Before going into more detail, let us introduce a set of clinical parameters (summarised in Table 1), which will be used throughout this chapter to further illustrate the concepts. These parameters were estimated by Iwata et al (2000) from clinical data on a hepatocellular carcinoma with multiple liver metastases. Although derived within the deterministic framework of the size-structured model (see Section 3.1 for more details), the inherent link with the PP framework described in Section 3.2 ensures that these parameters are relevant in the PP model also; we will therefore use the same set of parameters in both frameworks. Also, we will make use of a slight modification of this clinical setting to predict the risk of distant metastasis after surgery. To represent the impact of a surgery at time t surgery , the emission intensity will be set to zero for all times larger than t surgery .

Model
Parameter Symbol Value Unit Growth (Gompertz law) Initial size Emission power β 0.663 - Table 1: Growth and emission laws derived by Iwata et al (2000) from clinical data of a hepatocellular carcinoma case with multiple liver metastases. Primary tumour size is given by X p = g(X p ), X p (0) = x 0 , and primary tumour emission rate is given by . This set of parameters is used throughout the chapter; when used in the PP framework, the emission rate λ is taken as the emission intensity.
Randomness of emission means that each emission time can be represented via its probability density function; this is illustrated for the emission time of the first metastasis T (1) in Fig. 2.
The number of metastases N t is itself random in this model, but relevant deterministic quantities can be derived from N t , such as the expected number of metastases E[N t ] or the probability of metastatic disease P(N t > 0). Exploiting the memorylessness property of PPs, these quantities can be computed without any need to simulate the process (all the following formulas are proven in Appendix A): and Also, a formula for the variance of N t is obtained readily: The concepts N t , E[N t ] and var[N t ] are illustrated in Fig. 3. If a metastatic growth law is added to the model, the total metastatic mass (or total cell count, sum of lesion volumes) M t -again a random quantity-can be represented via the emission times of the PP. M t can be compared to quantitative measures of total metastatic biomass, obtainable e.g. via bioluminescence imaging (Hartung et al,  (Table 1)

. Its analytical formula is
2014). We will assume that all metastases follow the same deterministic growth law X m , but which can be different from the primary tumour growth law. Therefore, the size difference among metastases is entirely explained by differences in metastatic inception times, and M t can be written as where X m (0) = x 0 m is the initial size of a metastasis. Expectation and variance of the metastatic burden can also be calculated analytically: The assumption of equal growth law for the metastases greatly simplifies the model, which has both its advantage (for identifiability from clinical data) and drawback (for correct representation of cancer biology). Although beyond the scope of this chapter, it could be replaced by a less restrictive assumption, e.g. by supposing that individual growth parameters are drawn randomly from a normal distribution. However, even if easily integrated into numerical algorithms, such a feature would be prohibitive for any characterisation of the model through mathematical analysis.

Secondary emission
In the model described above, metastases do not have the capacity to emit metastases themselves. However, it is easy to think of a case in which such a property would make a difference in the model. For example, suppose that only a single metastasis is emitted prior to surgery of the primary tumour 2 . If this metastasis cannot emit further metastases, its successful removal cures the patient, but the second surgery may fail if the metastasis is able to seed as well. Of course, there are other mechanisms potentially leading to treatment failure (e.g. local recurrence, surgery impossible, etc.), but for simplicity these are not considered here.
In this section, we extend the previously shown PP model to account for secondary metastatic emission, using PPs as building blocks. Many of the advantages and limitations of the PP model carry over to the extended model, and we do not claim that a comprehensive framework for cancer metastasis is built in that way. The model does have the virtue, however, that it is simple enough to have a chance to be parametrised reasonably from clinical data.
Conceptually, the extension is straightforward: as before, the primary tumour grows according to X p and metastatic emission by the primary tumour is represented by a PP with intensity λ p . In addition, any emitted metastasis has the same capacities as the primary tumour, but possibly with different growth and emission rates (X m instead of X p , λ m instead of λ p ). If we consider a metastasis emitted at time s, this means that at a later time t, it reaches the size X m (t − s) and emits metastases with intensity λ m (t − s). Every newly emitted metastasis starts a new PP. Also, each metastasis has a precursor (either the primary tumour or another metastasis). The whole model then consists of the metastatic emission times from all of these PPs. Since each PP can start other PPs, we call this model a PP cascade. We then need to make a hypothesis on how the different emission processes play together.
Independence of emission. We assume that each metastasis emits independently of any other metastasis, and also independently of the primary tumour. In other words, all the PPs involved in the dynamics are independent.
Such an assumption is very important to be able to characterise the properties of the PP cascade with mathematical techniques. Note that simply ordering all emission events including secondary emissions by increasing emission time would not allow to use above independence assumption since the inception time of each metastasis depends on its level in the generational hierarchy (primary tumour, metastases emitted from the primary tumour, metastases emitted from the metastases emitted from the primary tumour, etc.). Therefore, in our model we have to account for the filiation of each metastasis. For example, the emission time of the first metastasis emitted by the primary tumour is denoted by T (1) , and the emission time of the first metastasis emitted by the first metastasis emitted by the primary tumour is denoted by T (1,1) , which depends on T (1) . More precisely, T (1,1) = T (1) +T (1,1) whereT (1,1) is the first emission time for the PP generated at time T (1) . Filiation in the cascaded model is further illustrated in Fig. 4; a rigorous definition is provided in Section 4.4.

Size-structured model
Let us consider a different framework for the description of metastasis, which also represents the metastatic process purely as growth and emission dynamics. To describe the micrometastatic state of cancer patients, Iwata et al (2000) developed a size-structured model. The model describes the time evolution of a density function ρ(t, x) representing the size distribution of metastatic colonies: the integral x 2 x 1 ρ(x, t)dx represents the number of metastases at time t with size between x 1 and x 2 . Therefore, ρ is like a smoothed histogram of the number of metastases within different size ranges.
To better understand why a size density is considered, it is instructive to draw an analogy to Lagrangian and Eulerian description of a fluid flow. In a Lagrangian description, the observer follows individual particles through the flow field. In contrast, in a Eulerian description, the observer considers the flow density through fixed reference points. These two frames of reference are illustrated in Fig. 5. In this picture, metastatic growth becomes "flow through size space". In the PP model, this is represented in a Lagrangian fashion: a growth function is associated to each individual metastasis. In the size-structured model, a Eulerian frame of reference is used: the entire population of metastatic tumours is described through a density function moving through size space at a "speed" g(x) (i.e., the growth rate), in other words a size-structured density. Formalising metastatic growth from an Eulerian point of reference leads to a PDE model. Metastatic emission is the boundary condition of the PDE, which means that it describes the "arrival of new particles into size space" (see Eq. (9)). In contrast to the PP cascade model, where metastatic emission was a stochastic process, emission is deterministic in the size-structured model. The emission dynamics consist of a primary tumour contribution and a contribution of the metastases themselves, both depending on the size of the emitting tumour. The size-dependency of the metastatic emission rate entwines metastatic growth with metastatic emission dynamics, which requires special attention during mathematical analysis of the model (Barbolosi et al, 2009) as well as for designing an efficient numerical resolution scheme (Hartung, 2015).
To illustrate the model dynamics, the clinical parameters of Table 1 were used to simulate the metastatic density function at different times (see Fig. 6). The model equations, together with relevant properties of the size-structured model, are presented in Section 4.3.

Bridging the gap: model observables
We now describe how the size-structured model and the PP cascade model are related. At first view, the two frameworks describe quite different objects. While the PP cascade is concerned with a collection of emission times with a generational hierarchy, the sizestructured model features a density function. Nevertheless, as we will see, the latter can be seen as the expectation of the PP cascade model. To describe precisely the relationship between the models, we need to introduce model observables as a common theme. In fact, we have already introduced some model observables without naming them so, namely the number of metastases, the number of micro-/macro-metastases, and the total metastatic mass.
Let us start with the size-structured model. For each function f , a model observable where x 0 m is the size of a newly emitted metastasis and x ∞ m denotes the theoretical upper boundary, i.e. it is integrated over all possible sizes of metastases. Different choices for f are possible, and each of them corresponds to one observable (this dependency is made explicit through the subscript f in MO f ). The definition includes the above mentioned quantities: • The number of metastases is obtained for f = 1, i.e. the function that equals 1 for all x: • Similarly, the number of macrometastases is obtained with and the number of micrometastases with where c is the detectability threshold, which depends on the imaging modality.
• The total metastatic mass M is obtained with the identity function f Id (x) = x for all x: Apart from allowing us to consider all these model-derived quantities at once, it is important for the mathematical proofs in Section 4 to consider such a general notion of observable. Writing down the model observables in the PP cascade model is slightly more complicated and it will be easier to illustrate it with an observable in the PP model without secondary emission. There, a stochastic model observable (SMO) is defined by The observables are defined in such a way that their interpretation is the same in both frameworks. For example, f = 1 yields the number of metastases N t and f Id yields the metastatic mass M t If we ordered all emission events including secondary emissions by increasing emission time (and still called these times T (1) ,T (2) , etc.), this could also be used as the definition of a stochastic model observable in the PP cascade. However, in order to carry out the calculations required for bridging the gap between the two frameworks, we need to account for the filiation of a metastasis, i.e. its level in the generational hierarchy. An explicit definition of the SMO using filiation is provided by Eq. (13) in Section 4.4. Similarly to Eq. (2), where the expected number of metastases E[N t ] was computed in the PP model without secondary emission, an expression for the expectation and variance of each SMO can be derived in the PP cascade model. These computations are more complicated and are presented in detail in Appendix A. It is then shown that the expected value of each SMO is equal to the corresponding MO in the size-structured model; in this sense, the size-structured model describes the mean behaviour of the PP cascade model: A rigorous mathematical statement of these results is given in Section 4.4. The relationship between model observables in the two frameworks is a consequence of a relationship between more fundamental mathematical objects (a random measure in the PP cascade and an absolutely continuous measure in the size-structured model). For the sake of simplicity, we do not present this additional layer here and refer to Section 4.4 for more details.

Implications
In physics, a density is usually derived on the hypothesis of a large number of constituting particles. In their derivation of the size-structured model, Iwata et al (2000) applied these principles to metastasis. However, this density notion is challengeable during the early phase of metastasis where the number of metastases is low: what is one single metastasis spread over the whole size range? The alternative interpretation as the expected value of a cascade of PPs provides a more flexible framework. For any model observable (e.g. the number of metastases), the adequacy of the size-structured model can be evaluated by quantifying the variance of the corresponding PP cascade.
Let us illustrate this approach by an example. When Iwata et al (2000) parametrised the size-structured model from clinical data on the size distribution of metastatic colonies, their model did not represent randomness inherent in the emission process. To account for this neglected source of variability, we use the crosslink between size-structured and PP cascade models. By simulating the PP cascade model with the same parameters (Table 1), standard deviation as well as typical trajectories of the stochastic model can be taken as a measure of variability around the prediction by the size-structured model. We choose the observables that Iwata et al (2000) used to parametrise their model, i.e. the number of metastases exceeding certain size thresholds c (i.e. f macro with different thresholds). Simulation results are shown in Fig. 7.
The deviation of the data from the size-structured model prediction are much smaller than the stochastic fluctuation of the PP cascade model, and we can interpret this from two different perspectives. Since these deviations are consistent with typical trajectories, the data are in principle explainable by stochasticity of emission. On the other hand, if we simulate a new patient with the stochastic model, the model parameters are likely to change purely due to stochasticity of emission. To put it differently, the precision of the parameters of the size-structured model is probably overestimated since the variability by randomness of emission is not taken into account.  N macro (t), which was used to fit the clinical data from (Iwata et al, 2000) (computed via Eq. (10) Iwata et al (2000), we count time from inception of the first primary tumour cell, which was back-calculated from primary tumour data assuming Gompertzian growth (hence the first CT scan with metastatic disease is approximately 3 years post-inception).
Parameter estimation is much easier in deterministic than in stochastic models. Nevertheless, Hanin et al (2006) were able to estimate model parameters in their PP model without secondary emission. However, if the statistical model becomes more complicated, e.g. a mixed-effects model to deal with population data (Lavielle, 2014), the computational and even methodological feasibility limit is quickly reached with a stochastic structural model (Tornøe et al, 2005). In this case, the crosslink described in Section 3.2 can be exploited to derive reasonable parameters for simulations from the PP cascade model by estimating the parameters of its mean, i.e. the size-structured model.
We applied this reasoning already in Fig. 7, in which the variability due to stochasticity of emission was discussed. To give an example using the stochastic nature of the PP cascade model more explicitly, we now use the stochastic framework to assess the impact of secondary metastatic emission following surgery of the primary tumour. Using the clinical parameters stated above, we assume that the primary tumour is surgically removed 500 days after its inception, where it has reached a tumour mass of 180 g, and assess the number of metastases another 500 days later (Fig. 8). Since every secondary emission is preceded by at least one primary emission, the probability of metastatic disease is the same in both models. However, on average a much larger number of metastases is predicted from the model with secondary emission (E[N t ] = 4.7 with secondary emission vs. E[N t ] = 1.2 without).  Iwata et al (2000), it is assumed that the primary tumour is surgically removed 500 days after its inception, and that the number of metastases is evaluated another 500 days later.
This chapter focused on Poisson processes as possible frameworks describing metastatic emission, usable to predict metastatic risk or micrometastatic dynamics. Although representing randomness in a relatively simple way, PPs have appealing properties which we have illustrated here. They are easily included as building blocks in larger models, which has been shown with the PP cascade model, but which applies in a much more general way. Also, they allow for a high degree of analytical tractability, which was exploited here to characterise the mean behaviour of the PP cascade model. Without doubt, further improvements of these techniques are required. In particular, to allow for individualised risk predictions, patient characteristics have to be matched to model parameters. In this respect, circulating biomarkers, such as circulating tumour cells or circulating tumour DNA, can be a useful source of information, especially since quantification methods are rapidly getting more reliable (Yu et al, 2013;Bulfoni et al, 2016;Paoletti and Hayes, 2016). Both frameworks presented here allow for such an extension. Once validated, a mathematical model can serve as a powerful tool for informed treatment decisions for cancer patients by integrating case-specific information into a consistent quantitative framework.

Mathematical formalism and results
This section is devoted to the mathematical formalism and the derivations of the results of sections 2 and 3. We provide here precise definitions of the mathematical objets and rigorous computations and results. We start by introducing the non-homogeneous Poisson process and derive formula (1) from the Poisson assumption. Next, we summarise key results for the size structured model, we introduce the probabilistic framework for secondary emissions, and then derive rigorously the link between these two models.
Throughout this section (Ω, F, P) will be a probability space on which all the random variables we consider are defined.

Definition of a Poisson process and basic properties
The Poisson distribution is a standard way to count the occurrences of some events.

Definition 4.1 (Poisson distribution)
The parameter µ ∈ R + can be interpreted as the expected number of occurrences since In our context, it counts the number of metastases. However, at this level, we have no information on the event times we are counting, nor how this number evolves with respect to time. To handle the random nature of these times, let us introduce the Poisson processes.

Definition 4.2 (Non-homogeneous Poisson process)
Let λ : R + → R + be a continuous function. We say that (N t ) t≥0 is a non-homogeneous Poisson process with intensity λ if: 2. the number of occurrences in disjoint time intervals are independent, i.e. for t 0 < · · · < t n , the random variables N t k − N t k−1 , k = 1, . . . , n are independent; 3. For all t > 0, N t has a Poisson distribution with parameter Λ(t), given by The terminology non-homogeneous comes from that the intensity function λ can vary in time, as opposed to a homogeneous Poisson process for which λ is constant. Also, there are several equivalent definitions for a non-homogenous Poisson process. For instance, the third item above can be replaced by the following properties: where o(∆t) stands for a function satisfying o(∆t)/∆t → 0 as ∆t → 0. In the previous definition, we chose λ continuous since it is the case in the situations we consider. Nevertheless, this assumption can be relaxed to more general non-negative functions. Finally, from a Poisson process (N t ) t≥0 , one can define the event times recursively as with T (0) = 0. We refer to Fig. 1 for an illustration of the relation between (N t ) t≥0 and the event times. From these times, one can consider the following (random) measure on R + : where δ x stands for the Dirac distribution at point x. This measure is called the Poisson random measure associated to (N t ) t≥0 . From this definition of P , it is direct to see that for any t ≥ 0, Here, 1 A is the indicator function of A, that is it takes the value 1 if A is true and 0 otherwise. In the same way, the total metastatic biomass (8), for instance, can be rewritten as Expressions involving an integral with respect to the Poisson measure allow convenient manipulations, as we will see in Appendix A using Proposition A.1.

Derivation of empirical formula from Poisson assumptions
Let us assume that the primary tumour diameter d(t) follows a power law: Power growth of volume with a power between 2/3 and 1 has been described in the literature, leading to the above model if we assume a spherical shape of the tumour. Furthermore, let us assume that metastatic emission is governed by a Poisson process with intensity λ(t) = b d(t) β . We will require β > 0, since the emission rate should increase with primary tumour size. Then, the number of metastases N t by time t is Poisson distributed with parameter Λ(t) = t 0 λ(s)ds and the probability of metastasisfree disease at time t is given by and assuming that the tumour is initiated with a negligible size (d 0 This then yields the empirical formula Since β > 0 and α ≤ 1, we have z > 0, and the above manuipulation is justified.
It should be noted that although c and z can be determined unambiguously from information on metastatic status and primary tumour size at surgery if the patient cohort is large enough, this does not hold for the growth and emission parameters of the underlying Poisson process. To distinguish the growth and emission processes additional information is required, such as repeated tumour size measurements over time.

Summary of key results on the size-structured model
The framework proposed by Iwata et al (2000) focuses on the evolution of a sizestructured metastatic density ρ. Originally, it was assumed that primary and secondary tumours have the same growth and emission patterns. Here, we present an extended version, described e.g. in Hartung (2015), where primary and secondary growth and emission dynamics can be different.
As before, X p denotes the size of the primary tumour and γ p the primary tumour emission law. The size of a metastasis is given by X m , which is the solution of an autonomous ordinary differential equation The emission law of the metastases is γ m . Then, the metastatic density function ρ solves the following equation: (9) We also introduce the emission rates of the primary tumour and the metastases, respectively: λ p (t) := γ p (X p (t)), and λ m (t) := γ m (X m (t)).
Existence of a unique weak solution ρ to this model has been shown under general conditions by Barbolosi et al (2009). For the purpose of this chapter, it is sufficient to assume that λ p , γ m , and g are continuously differentiable nonnegative functions, and that lim t→+∞ λ p (t) < +∞ and g(x ∞ m ) = 0. Model observables for the size-structured model have been introduced in Eq. (6). As shown by Hartung (2015), they can be characterised as the solutions of a Volterra convolution equation: ), MO f is the unique solution of the following renewal equation: This alternative formulation will be important to bridge the gap between the stochastic and deterministic frameworks. Of note, it is also the basis of an efficient numerical resolution algorithm (Hartung, 2015).

Probabilistic framework for secondary emission
Let us first remind the reader that we assume all the emissions of the primary tumour and the metastasis to be independent. To exploit this property in calculations, we have to take care of the filiation of each metastasis, i.e. the generational hierarchy (the primary tumour, the metastases emitted from the primary tumour, the metastases emitted from the metastases emitted from the primary tumour, etc.). We will therefore introduce a cascade of independent PPs, and define recursively the emission times with respect to the generational hierarchy.
• The emission times for the first generation of metastases, that is, the one emitted by the primary tumour, are the event times of a PP (N (1) ) t≥0 with intensity λ p ; we will write Π (1) := (T (j) ) j≥1 for the set of random emission times.
The emission times for the next generations of metastases are defined recursively.
We refer to Fig. (4) for an illustration of these emission times, but for instance, T (2,3,4) is the inception time of the 4 th offspring produced by the 3 rd offspring of the 2 nd offspring of the primary tumour. Using biologically relevant parameters, the expected emission times for all but the first few generations are larger than any reasonable observation timeframe. However, even if the contribution of late generations is very small, we need to consider the whole cascade of emission times to bridge the gap to the size-structured model. Finally, assuming that (N (n 1 ,...,n k ) ) t≥0 , k ≥ 1, n 1 , . . . , n k ≥ 1 is a family of independent PPs implies the biological assumption we made, which is that the primary tumours and all the metastases emit independently from each other. In the PP model without secondary emission, model observables were defined in Eq. (7). With the PP cascade defined above, we are now able to extend this concept to secondary emission constructively. For f ∈ L ∞ ([x 0 m , x ∞ m ]), the SMO for the k-th generation can be expressed by SMO (k) f (t) := n 1 ,...,n k ≥1 1 (T (n 1 ,...,n k ) ≤t) f X m t − T (n 1 ,...,n k ) , and we have the following definition.

Definition 4.3 (Model observables with secondary emission)
The SMOs with secondary emission are given by In this definition, SMO (k) f describes the contribution of the k-th generation to the SMO. The following proposition links the MOs from the stochastic and deterministic frameworks.

Proposition 4.1 (Link to the model observables)
is finite for all t ≥ 0 and satisfies (10), so that Let us remark that the SMO (13) may also be seen as integrals w.r.t. a random measure for any t ≥ 0, with M t := k≥1 n 1 ,...,n k ≥1 δ Xm(t−T (n 1 ,...,n k ) ) .
This description is the key point to bridge the gap to the description of metastasis via a structured population equation.

Theorem 4.2 (Link to the structured population model)
is σ-finite, absolutely continuous with respect to the Lebesgue measure, and its Radon-Nikodýn density is given by ρ(·, t), where ρ is the solution of the structured population equation (9).
The last result we present here concerns the variability of the SMO (13) with respect to its mean MO f .

Proposition 4.2 (Variance of observables)
is finite for any t ≥ 0, and satisfies a renewal equation: Here, and where SMO m,f is defined as (13), but for a different cascade of PPs, which has only λ m for intensity (both for the first and subsequent generations).
This result is of great interest to design confidence intervals as illustrated in Section 3.3. In fact, the renewal equation (15) allows the use of an efficient numerical resolution algorithm (Hartung, 2015).

A Appendix: Proofs of results of Section 4.4
The proofs provided in this section are based on the following classical result on Poisson random measures. We refer to Ç ınlar (2011, Chap. 6 pp. 251) for further details. Also, note that this result directly yields Eqs. (2) through (5).

A.1 Proof of Proposition 4.1
We first need to establish the following lemma, which is proven further below.
Lemma A. 1 We have e f = λ p * (f (X m ) + e m,f ), where e m,f has been introduced in Proposition 4.2.
This is not exactly the renewal equation we want. To derive the desired equation (10) we just have to make the following remark. Taking λ p = λ m , Lemma A.1 gives that e m,f satisfies e m,f = λ m * (f (X m ) + e m,f ).
Hence, from (17), we have Now, let T > 0, we have from the last line of (18) that for all t ∈ [0, T ], t 0 e f (u)du, which gives using Gronwall's inequality As a result, e f (t) < +∞ for all t ≥ 0 since T is arbitrary, and also P SMO f (T ) < +∞ = 1.