Tracking of multiple fluorescent biological objects in three dimensional video microscopy

A method which allows for the first time to perform automatically the detection and the tracking of microscopic objects directly from three dimensional image data is presented. It enables to analyse biological moving objects in three dimensional fluorescence image sequences coming from biological immunomicroscopy experiments, and get quantitative data such as the number of objects, their position, movement phases and speed. After a detection step is performed through the multiscale analysis of images using a shift-invariant wavelet transform, the tracking is achieved using a Kalman filter and an association which enable the position of the moving objects to be predicted, refined and updated. Trajectories are analysed in terms of different parameters relevant for the motility analysis of biological objects.

perspectives dedicated to the study of cellular dynamics, and to the links between cellular functions and spatial-temporal localization. For example, nowadays, it is routine in cell biology to film cells, organelles, and pathogens to document infectious disease processes in living systems [1]- [4]. In many such applications, the fluorescently labeled biological objects are visualised as small bright spots superimposed on an uneven background, where a spot is defined as an object which is relatively small and compact and has no clear border, and whose intensity is both diffuse and higher than that of its immediate neighborhood. The reliable quantitative study of motility and dynamic properties requires the computation of parameters like number of spots, position, spatial distribution, movement phases, speed, and diffusion coefficients and, therefore, requires that all the spots in the image sequences are detected and tracked [5], [6]. In the context of living cellular systems, spot tracking can be made particularly difficult by the facts that the dynamics of each spot can change over time or that spots may aggregate temporarily making their appearance change and their proper detection more difficult [6]. Not surprisingly, many biological object motility studies are based on the study of a few hand-picked particles which represent only a small subset of the total [2]. On top of being tedious and time consuming, this type of manual tracking is prone to many errors, highly dependent on operator's skills and perception [1], and can introduce a strong bias in the analysis. It is, therefore, of utmost importance to develop methods to perform the tracking in an automatic, reproducible, and unprejudiced manner.
A large body of work has addressed the tracking of single particles in optical microscopy (see, e.g., [7], for a review) and of multiple spots in military or video imaging [8], [9]. Conversely, relatively little effort has been devoted to multiple particle tracking (MPT) in the cellular and molecular imaging domain [10]- [17]. Conventional MPT methods are based on simple intensity thresholding [10] or local maxima extraction [11] for detecting the spots and on nearest neighbor association (NNA) [10] or constrained NNA [11] to perform the tracking. These methods work well on image sequences showing a limited number of very bright spots on a uniform background, but they fail as soon as the intensity of the spots is not modal, the spots are embedded in noisy images, the density of spots is high, or the displacement of spots is higher than the interspot spacing. In [12], a solution to improve on the previous methods is proposed. It uses the combination of four techniques, namely highly sensitive object detection, fuzzy logic-based dynamic object tracking, computer graphical visualization, and measurement in time space. Although this method is effective when working with well separated objects, it shows some limitations in keeping up with aggregating ones. A more robust algorithm is presented in [13], [14]. Based on ideas from operational research and graph theory, it proceeds in four steps: particle detection; generation of candidate matches, i.e., a set of possible displacement vectors between successive frames; scoring of candidate matches; and selection of the candidate subset with maximum global score and no topological ambiguity. This method, however, relies on a number of heuristic rules and on a priori information on the movements of objects to be effective. Also, the extension and application of the algorithm to four-dimensional data has not been reported.
Here, we report a method to perform the detection and the tracking of microscopic spots directly on four dimensional (3-D+ ) image data. It extends and outperforms our previous work [15]- [17] by being able to detect with high accuracy multiple biological objects moving in three-dimensional (3-D) space and by incorporating the possibility to follow moving spots switching between different dynamics characteristics. Our method decouples the detection and the tracking processes and is based on a two step procedure: first, the objects are detected in the image stacks thanks to a procedure based on a 3-D wavelet transform; then, the tracking is performed within a Bayesian framework where each object is represented by a state vector evolving according to biologically realistic dynamic models. The main advantage of wavelet-based detection is to be robust to the local variation of contrast and to the imaging noise. The Bayesian tracking allows us to predict the new position of a spot knowing its past positions and increases the reliability of the data association step.

B. Bayesian Multitarget Tracking
Bayesian multitarget tracking methods consist of filtering successive measurements coming from a detector. They are generally split into two subprocedures: Bayesian filtering and data association. The generic formulation of Bayesian filtering is to compute the successive posterior densities from the set of already detected measurements and the following system: where is the measurement of a hidden evolving signal obtained at each time step through the observation process (2) with a time prior (i.e., the assumption made on the evolution of state through time) given by (1). The exact solution to this problem is constructed in two steps. First, a prediction of the prior density is computed by based on a Markov assumption on the process . Then, from (3), and thanks to the Bayes rule, the update of the posterior density is obtained as (4) In practice, however, this scheme cannot be applied because the multidimensional integral in (3) and (4) cannot be computed in the general case. Several solutions have been proposed to make the problem tractable by imposing some constraints on the density distributions. Depending on the constraints and on the a priori information available on the system, the solution to the posterior density can be optimal like in the case of the well known Kalman filter (KF) (optimal in the MMSE sense for a linear Gaussian density) [18], suboptimal (yet more efficient to approximate the posterior density in nonlinear/non-Gaussian cases) in the case of the extended Kalman filter (EKF) [8], the Kalman unscented filter [19], and the particle filter (PF) [20]- [22] or exact in the case of the grid-based filter (GBF) [22]. All the previous filters are bound to use just one dynamic model in their scheme, however, which is problematic when the objects' dynamics vary with time as it is the case with biological objects. The interacting multiple models (IMMs) filter [23], [24], instead, has been designed, first in the context of radar imaging, with the capability to have different models in parallel, and to select and switch to the model which is more accurate to represent the movement during a given period. The IMM has the additional ability to rapidly self adapt to transitions. This altogether makes the IMM the best choice for our application and here we report its first application to biological imaging and propose several models adapted to biological object dynamics.
Data association is a crucial step when tracking several particles in the presence of clutter. Indeed, as the detection may miss or wrongly detect objects, naive association methods will create inconsistent associations leading to disconnected trajectories in subsequent images. Among the numereous methods that have been developed to tackle this problem, the most popular algorithms are the nearest neighbor (NN) [8], the joint probabilistic data association (JPDA) [9], [25], and the multihypothesis tracking (MHT) [26], [27], but none of them is really adapted to our specific needs. The NN is very sensitive to noise and performs quite badly even with moderate noise levels. The JPDA, which computes the joint probability of all the different associations, requires the number of objects be known in advance and remain constant during the sequence, a condition not met in our applications where the number of particles varies in time. The MHT algorithm is very compelling conceptually as it is based on the dynamic computation of the probability tree describing all possible associations of particles over time. Notwithstanding its theoretical advantages, it rapidly becomes intractable because the complexity increases exponentially with the number of tracked objects. There are no established and reliable implementations for applications where the number of considered objects exceeds a few units; thus, the MHT could not satisfy our demand of being able to handle several tens or hundreds of particles. We, therefore, propose an association method based on a maximum likelihood approach that has proved to achieve better results than an optimal NN (i.e., that minimizes the distance sum) approach with a reasonable complexity.

C. Tracking of Vesicles Labeled With Quantum Dots (QDs)
To validate the performances of our method on real biological microscopic data, we have tracked endocytic vesicles to monitor the effect of the overexpression of the protein tau in the microtubule dependent transport of single vesicles in living Hela cells. Vesicles were marked with red fluorescent inorganic CdSe/ZnS QDs. We compared the results obtained in nontreated cells with those in cells transfected with tau-GFP. Our results are in agreement with the work from another team [28] where the analysis of microscopy data was done manually and confirm that tau does not alter the speed of motors but reduces the mean 3-D trajectory range of endocytic vesicles [28]. This confirmation is very encouraging and it opens the way to an increased use of automatic tracking methods to analyze complex biological image sequences.

D. Organization of the Paper
The paper is organized as follows. In Section II, we briefly describe the detection method. Section III presents the different algorithms used to perform the tracking together with an evaluation of the IMM filter. The results achieved by the presented method are then evaluated in Section IV, where results are presented both for synthetic sequences and for real biological microscopy sequences.

II. SPOT DETECTION
We have developed a spot detection method which is based on a shift invariant 3-D discrete wavelet transformation of the image stacks and on the selective filtering of wavelet coefficients. This method extends our previous work on two-dimensional (2-D) images [29] and presents several advantages over simpler segmentation methods: 1) It is robust to local variations of contrast and to the imaging noise. 2) It allows us to efficiently segment fluorescent spots imbedded in noisy images such as those acquired with real-time confocal microscopy. 3) It enables to select spots of a specified size. 4) It allows us to discard low-frequency objects that are present in the background and that would hinder any conventional threshold based method. 5) It allows us to take into account the difference in resolution along the axis by selecting filters of different length in , , and (typically, we use a B3-spline filter [30] in the and directions and a Haar filter in ). Given an input noisy image stack , and its wavelet transformation , wavelet thresholding is performed with Jeffreys' noninformative threshold [31] as an estimation of the significance of wavelet coefficient at a given scale and position (5) where is the standard deviation of noise at scale in the wavelet domain (we assume that the noise is stationary at each scale). The choice of this threshold function is motivated by the fact that other functions like Donoho's hard or soft threshold [32] are too drastic in cutting wavelet coefficients and do not allow for weak objects to be detected. To characterize the spots, we compute a correlation image stack which is the direct multiplication of denoised wavelet image stacks corresponding to the selected scales (6) The correlation reduces the remaining noise and has significant values only at locations that correspond potentially to spots. After binarization, each connected component (spot) in is extracted to create a measurement vector consisting of its location , volume and mean intensity at time [16]. This vector is used in the following steps to create the spot trajectories. An example of detection result is presented in Fig. 1.

III. TRACKING
To establish the trajectories, it is necessary to link the successive vectors representing the set of available measurements on the system. To avoid creating inconsistent associations leading to disconnected trajectories in subsequent images, two important facts have to be taken into consideration. 1) A detection does not necessarily correspond to a real object but could have been generated by clutter, and 2) an object might have been missed (not detected). To properly handle these situations, the tracking algorithm has to be able either to terminate a track if a wrong measurement has been generated or to provide a predicted measurement that will be used to temporarily retain a track in case a real measurement can be assigned in the subsequent images. Another major problem is that biological object dynamics vary abruptly and frequently over time leading to association problems. To handle this, we use an IMM filter with several dynamic models that allow us to achieve a good prediction during the transition between movements.

A. IMM Filter
The IMM filter is a Bayesian state estimation algorithm for a system represented by Markovian switching coefficients. It was originally developed by Blom [24] and used by Blom and Bar Shalom [23] first in the context of radar imaging. We recall, here, the principles of this method. We want to compute the posterior density as a weighted sum of gaussians based on the assumption that the generic system presented in (1) and (2) is reduced to a set of switching linear models with additive Gaussian noises and, thus, has the form where is the system state vector at time , is the measurement vector at time , and is the state transition matrix for the event (event defined as "model is active at time "). is the observation matrix for . and are the process and measurement noise vectors, that are mutually uncorrelated zero-mean white Gaussian processes with covariance matrices and . We denote by the index of the models. The switching between models is governed by a finite state Markov chain with probabilities of switching from to . The optimal approach for filtering the system state would be that every possible sequence of models be considered, thus involving an exponentially increasing number of filters. To make the computation tractable, the IMM estimator [9], [23] uses instead an approximation which is to consider the two most recent sampling periods only. The IMM estimator is a recursive process consisting of several steps at each iteration.
First, the initial input for each KF at time step (also called the mixed estimated state and covariance, and noted with a "0" superscript) is computed by where the conditional model probabilities are given by (11) and the predicted model probability by (12) Second, for each filter, the state prediction and covariance prediction are computed for each model with the standard KF equations [8], [18] with initial values given by (9) and (10). These predictions are used in the association step which provides to each track the measurement that corresponds to the best global solution as detailed in Section III-D and from which it is possible to compute the state estimation and the covariance estimation . These estimations can be computed in parallel for each model as they are totally independent from each other.
Third, as the innovation is considered to be a -dimensional Gaussian statistic with zero mean, it follows that the likelihood of each filter matched to is given by (13) where and are, respectively, the innovation and the covariance of the innovation of KF .
Then, the probability for is (14) and the combined state estimate and covariance are

B. Capability of the IMM
To illustrate the capability of the IMM to self-adapt to different movement types as well as to the switching between them, we simulated a one dimensional signal that switches between two linear models. To simulate the noise and the measurement process , the signal was corrupted by sequentially adding two Gaussian noises with different variance to the original signal. The measurement was then filtered back independently by a KF using one of the models and an IMM filter with both models being active. Fig. 2 shows that the IMM produces an estimation that is globally much more reliable than the KF. Conversely, the KF is more efficient on the local portions of the signal ( to 70) that corresponds to its model, which is well in accordance with the fact that the KF is the optimum solution for a linear Gaussian system. Fig. 3 illustrates that switching between models is tractable thanks to the multimodal posterior density which is computed by the IMM.

C. Proposed Dynamic Models
To adapt the IMM to biological imaging, we propose to use three different models of dynamics: random walk (RW), firstorder linear extrapolation (FLE), and second-order linear extrapolation (SLE). They model Brownian motion and directed movement with constant speed or acceleration, which are representative modes of motion encountered in biology [3]. We make the additional realistic hypothesis that, during movement, the biological objects can switch abruptly between the three models. This description is complemented by the hypothesis that the  volume and the mean intensity of each spot remain approximately constant with an additive gaussian noise. By writing the state vector as a concatenation of three successive locations (17) each of three models can be represented by a linear application, thus enabling us to take into account in the IMM up to three consecutive vector states while still lying within a Markov process of order 1. The three models are as follows.
1) The RW model which makes the assumption that the next state is defined by the previous state and an additive Gaussian noise 2) The FLE model which makes the assumption that the next state is defined by the linear extrapolation of the last two 3-D locations while keeping the same intensity and volume 3) The SLE model, which makes the assumption that the next state is defined by the linear extrapolation of the three last 3-D locations while keeping the same intensity and volume The observation model related to the measurement vector (II) is common to all three models (18)

D. Association
To form the correct tracks, it is necessary thereafter to go through an association stage. This consists in finding the best assignment between the new measurements obtained at the detection step and the predicted measurements provided by the IMMs. The assignment is made as follows. First, we compute the maximum likelihood of the innovation among the models in each IMM that we denote . That is to say, for each measurement , and each IMM filter , , we find the value which maximizes (13). Each of them is used to form the following matrix: . . . . . . (19) This is done only for the measurements which are within the validation gates (i.e., the Mahalanobis distance between predicted and estimated measurements is inferior to the square of a given which is determined from a table as corresponding to a gating probability chosen to be at least 0.95).
The assignments are then found according to the following: (20) with (21) The search for assignments stops when . Even though this assignment technique is sub optimal in a global sense, i.e., it may occur that (22) it produces in our application context much better results than an algorithm for optimal assignment between two sets like, for example, the Jonker and Volgenant algorithm [33]. When dealing with our data, the JVC algorithm was not able to output correct associations on more than the first five time points in a sequence. This can be explained by the fact that the JVC algorithm requires the cardinality of the two sets be equal, an assumption that is not met in our case as we are trying to resolve an ill-posed problem where the number of predicted measurements never equals the number of measurements.

E. Refinement
Once the association of each prediction with a measurement is accomplished, each IMM filter (track) is refined. Reliable state estimations of each object are computed with (15), while (14) enables each IMM to update the probabilities of their associated models.

A. Validation of Automatic Tracking on Synthetic Data
In order to assess the quality of our method and its suitability to biological data, we have generated sequences with artificial . Note the similarity with real microscopy data in Fig. 1. (b) Detected spots. spots whose characteristics are as close as possible to fluorescent spots. The spots are represented by 3-D Gaussian shapes with different random covariance matrices in order to get different shapes of different sizes. For example of generated spots and of their detection, see Fig. 4.
To test the robustness of the tracking algorithm to the density of spots, we included 10, 20, 30, or 40 spots in image stacks (the resulting spot density then being 0.1, 0.2, 0.3, and 0.4, for 10, 20, 30, and 40 spots) and generated five sequences of 30 time points for each condition (20 sequences in total). Here, the size of the test volumes was kept small in order to achieve relatively high densities of spots within tractable volumes. In each sequence, the spots were made to move randomly and their direction changed randomly at a random time of between one and five frames, with an additional random modification of the covariance matrix of the 3-D Gaussian shapes with time to make their aspect change. To simulate real world conditions, spots were allowed to temporarily aggregate and cross. Also, a simulated background was generated by using a mixture of Gaussians with high variance and white Gaussian noise was added to the sequence in order to represent the noise present in typical microscopy images. Finally, the image stacks have a lower resolution in the direction to simulate the anisotropic resolution of 3-D microscopy images. Table I gives a summary of the influence of the density of spots within the same volume on the performances of the algorithm. The results show that even with a high density of spots the performances of the algorithm are satisfactory and that the IMM clearly outperforms the results achieved by using a KF with any of the three models (RW, FLE, or SLE) taken alone. Fig. 5 shows a synthetic sequence that was generated with 160 synthetic spots in a volume similar to the real data shown in Fig. 6, resulting in a similar spot density of . By applying the IMM filter to the data on this sequence, we could achieve a tracking with 85% of true positive   Table I.

B. Tracking of QDs
We used our method to investigate the effect of the over-expression of the protein tau in the microtubule dependent transport of single vesicles in living cells. Tau is a microtubule associated protein (MAP) which regulates microtubule polymerization [34]. Independent experiments performed by a different team [28] have shown that tau does not alter the speed of moving vesicles but it affects the frequency of attachment and detachment of vesicles to the microtubules [28]. To monitor vesicle tracking we marked them with red emitting fluorescent inorganic CdSe/ZnS QDs and imaged them with time lapse 3-D fluorescence microscopy. A typical image stack acquired with the experimental setup described in Appendix I is presented in Fig. 6.
In the experiments, we monitored the dynamics of QD-containing vesicles over periods of several minutes without observing a decrease in the fluorescence emission. We compared the results obtained in nontreated cells and in cells transfected with tau-GFP for a total of about 30 cells in eight sequences (several cells can be found in one sequence). For each sequence, between 150 to 200 vesicles were tracked at the same time. For each trajectory, we computed the transport speed, the trajectory range R, defined as the largest distance separating the first and any other point in the sequence, and the mean-square displacement (MSD), defined as MSD < >, where is the distance traveled in seconds. Fig. 7 shows the MSD curves corresponding to the movement of vesicles in cells for four sequences in the presence of tau (red) and for another four sequences in the absence of tau (blue) (each curve is the average of the MSDs for all vesicles in any of the sequences) and shows that in tau-transfected cells the motion is confined in a smaller volume than in the case of the nontransfected ones. From the data analysis, we found a mean speed of 0.12 and 0.11 m/s and a mean trajectory range of 1.48 and 0.95 m for nontreated and transfected cells, respectively. Therefore, we can confirm the result in [28] that tau does not affect the mean motor speed. On the contrary, as already suggested by Fig. 7, the overexpression of tau induces a reduction of almost a factor 1.6 for the value for the 3-D mean trajectory range. Also, from the values of the MSD curves corresponding to the two groups, we found the diameter of the mean diffusion volume to be 1.7 m in absence of tau and 1.0 m in presence of tau.

V. CONCLUSION
We have presented a method to detect and track fluorescence spots presenting different and switching movement types in 3-D+ live biological microscopic images. The method uses a shift-invariant 3-D wavelet transform for the detection of spots, an IMM algorithm with different transition models for the prediction and estimation of the state of the object and a data association method based on the maximum likelihood for establishing trajectories. We have shown on generated sequences and on real microscopy data sequences that this multistage algorithm makes it possible to establish valid trajectories even in presence of a high density of spots. We have assessed the validity of our method by monitoring the effect of the overexpression of the protein tau in the microtubule dependent transport of single vesicles marked with QDs in living Hela cells and confirmed independent results that tau does not affect the mean motor speed. We now plan to apply our method to study the role of pathogen dynamics in the triggering of infectious diseases.

APPENDIX I QDS SYNTHESIS AND MICROSCOPY SETTINGS
The synthesis of colloidal CdSe/ZnS nanoparticles is described in detail in a previous paper [35]. The hydrophobic particles were silanized with phosphonate silane as described in [36] to achieve water solubility. HeLa cells were grown in Dulbecco's modified minimal essential medium (Gibco, Cergy-Pontoise, France), supplemented with 10% fetal calf serum. For the experiment they were plated at a concentration of cells/ml on 12.5 cm round glass coverslip. After 24 h, the cells were transfected with the tau-GFP vector. Hydrophilic fluorescing nano-crystals with an emission wavelength of 637 nm were added to the medium to achieve a resulting concentration of 10-nM nanocrystals. The cells were then incubated for additional 4 h to allow the cells to completely ingest the nanocrystals and then rinsed two times with PBS to remove residual nano-crystals in the medium. Samples where observed with an inverted microscope (Leica) equipped with an 100 oil-immersion objective and a 100-W mercury lamp. For every image, one channel in differential interference contrast mode and two individual channels in the fluorescence mode were recorded using the GFP filter set to image tau-GFP and a red filter set to image the QDs. The time-lapse sequences were recorded using the software Metamorph (Universal Imaging, WS). Nanocrystal subcellular traffic was tracked by recording on the red channel five planes separated by 0.5 m. For each plane, we used an integration time of 200 ms, which results in total acquisition time of 1 s for each stack. For each cell, a total of 60 stacks were acquired at an interval of 1 s.