Indoor Tracking: Theory, Methods, and Technologies

In the last decade, the research on and the technology for outdoor tracking have seen an explosion of advances. It is expected that in the near future, we will witness similar trends for indoor scenarios where people spend more than 70% of their lives. The rationale for this is that there is a need for reliable and high-definition real-time tracking systems that have the ability to operate in indoor environments, thus complementing those based on satellite technologies, such as the Global Positioning System (GPS). The indoor environments are very challenging, and as a result, a large variety of technologies have been proposed for coping with them, but no legacy solution has emerged. This paper presents a survey on indoor wireless tracking of mobile nodes from a signal processing perspective. It can be argued that the indoor tracking problem is more challenging than the problem of indoor localization. The reason is simple: From a set of measurements, one has to estimate not one location but a series of correlated locations of a mobile node. The paper illustrates the theory, the main tools, and the most promising technologies for indoor tracking. New directions of research are also discussed.

Davide Dardari, Senior Member, IEEE, Pau Closas, Senior Member, IEEE, and Petar M. Djurić, Fellow, IEEE
Index Terms: Bayesian filtering, data fusion, indoor tracking, simultaneous localization and mapping (SLAM), technologies for tracking.

I. INTRODUCTION
Indoor real-time locating systems (RTLSs) have been gaining relevance due to the widespread advances of devices and technologies and the necessity for seamless solutions in location-based services. An important component of RTLSs is indoor tracking where objects, vehicles, or people (in the sequel referred to as mobile nodes) are tracked within a building or any enclosed structure. Examples include tracking of products through manufacturing lines, first-responder navigation, asset navigation and tracking, operation of indoor unmanned vehicles, or people movers [1]. The widely diffused Global Navigation Satellite System (GNSS) offers a worldwide service coverage due to a network of dedicated satellites [2]. GNSS is recognized to be the legacy system in outdoor environments and, to a great extent, one of the most accurate sources of position information when it is available. However, its operation in indoor or obstructed environments is infeasible, and, instead, alternative systems have to be adopted. There are some ad hoc solutions for indoor tracking, and they are based on a large variety of technologies, from the early RTLSs that exploit ultrasounds (e.g., the ActiveBat system) to more recent impulse radio ultrawideband (UWB) techniques [2]-[4]. In parallel, in the robotics community, where tracking is of crucial importance, systems for simultaneous localization and mapping (SLAM) have been investigated, mainly using laser and vision technologies [5].
A current trend in addressing indoor tracking is to use standard, low-cost, and already-deployed technologies. One driver of this activity is the enabling of smartphone-centered indoor positioning systems (IPSs) [6]. In general, it is expected that the market opportunities for RTLSs and IPSs will be on the order of $10 billion yearly in 2024 [6]. The technologies used in these systems are highly heterogeneous, encompassing Wi-Fi, UWB, radio-frequency identification (RFID), Bluetooth, near-field communication (NFC), Third Generation Partnership Project (3GPP)/Long-Term Evolution (LTE), signals of opportunity, and inertial measurement units (IMUs). It goes without saying that the latest challenge in indoor tracking (as well as localization) is not only to design specialized sensors for these tasks but to devise and implement data fusion methods that can exploit the already-available technologies. Data fusion in indoor tracking is a key element for further advances and presents exciting challenges particularly for signal processing practitioners and researchers. Due to the large variety of technologies and involved standards, a full understanding of the theoretical basics and a good mastery of advanced statistical tools are fundamental to allow for design of modern tracking systems. Real-time approaches that have been proposed are mainly based on the Bayesian filtering methodology, including variants of the Kalman filter (KF) and the much more versatile framework provided by particle filtering (PF) [7]. These powerful statistical tools allow for a general way of coping with heterogeneous measurements, noise, and user mobility models.
In this survey, we introduce the problem of indoor wireless tracking of mobile nodes within a quite general framework. We also consider the mapping problem as it is tightly related to tracking. We present statistical-based methods that are available for resolving these problems. The main types of measurements and technologies that are used for tracking of mobile nodes are also discussed.

II. TRACKING ESTIMATION PROBLEM
Wireless tracking systems basically involve the presence of a number of reference wireless nodes (anchor nodes or landmarks) deployed at fixed locations and of one or more mobile nodes (often referred to as agents, targets, or mobile users; see Fig. 1). The terminology is not universal, and it often depends on the underlying technology. For instance, in cellular-based tracking systems, the term base station (BS) is used for anchor nodes, whereas mobile station (MS) is reserved for a node that moves (an MS is sometimes called user equipment). In most applications, the time-varying positions of the mobile nodes are unknown and have to be estimated. On the other hand, the anchor nodes are located in a priori known positions. The locations of the anchor nodes, however, may not always be known, as in SLAM [5]. There, one aims at localizing a set of fixed landmarks and constructing a map of the surrounding environment by navigating through a predetermined path.
A standard problem of RTLS is localization, where the objective is to determine the location, in a coordinate reference frame, of one or more nodes with respect to reference locations typically marked by dedicated reference nodes. This process requires interactions among the nodes where, first, specific position-dependent measurements are performed between the nodes, and second, these measurements are processed to determine the position of the nodes with unknown locations. A typical example of measured data is the distance between the involved nodes. A detailed description of the types of measurements can be found in Section III.
In this review, we do not address the localization problem; a number of papers with good reviews have already been published (e.g., [3] and [4]). Instead, we focus on indoor tracking, which can be viewed as a sequence of position estimations. Indoor tracking systems must not only determine the instantaneous position of the mobile node at a given time but also track and predict its trajectory in real time. Tracking can be achieved as a mere sequence of independent location estimates, regardless of the system history (i.e., without memory), but more frequently and efficiently, it involves the estimation of velocity and acceleration and all the past states of the mobile node. This is accomplished by using mobility models that describe the node's movement.
A different concept related to tracking and often confused with localization is navigation. It is based on past position estimates, and it consists of controlling the course and the current position of a mobile node with the purpose of following a predetermined path or of getting to a target destination. Navigation is often a component of SLAM systems, as well as tracking procedures.

A. Problem Statement
Consider a radio positioning system composed of a set of $K$ nodes capable of interacting with each other through wireless signaling (see Fig. 1). The physical configuration of the $k$th node at discrete time step $n$ (e.g., position, velocity, acceleration, and orientation) is described by the node's state $x_n^{(k)}$, and the state of the whole system is $x_n = \{x_n^{(1)}, \ldots, x_n^{(K)}\}$. We point out that if the system has anchor nodes, their states are known and time independent. The tracking problem can be viewed as a statistical inversion problem where a time succession of (hidden) system states $x_{0:n} = \{x_0, x_1, \ldots, x_n\}$ has to be estimated based on a set of noisy measurements $y_{1:n} = \{y_1, y_2, \ldots, y_n\}$. Specifically, $y_n$ is the set of measurements available at step $n$ and, in general, includes the measurements between any pair of nodes, with $y_n^{(k,m)}$ denoting the measurements acquired by node $m$ about node $k$, if available. When the measurements can be taken not only between anchor and mobile nodes but among mobile nodes as well, then the tracking becomes cooperative. This model applies only when each measurement can be uniquely associated to the IDs of the nodes. There are, however, settings where the sensors do not know the ID of the nodes that produce the signals they sense [8]. In this paper, we do not address tracking under these conditions.
The measurements are useful for tracking if they are related to physical quantities affected by the geometric and inertial configuration of the nodes. By taking advantage of the radio propagation characteristics, the nodes extract position-dependent features from exchanged signals (internode measurements). A mobile node can also carry out self-tracking by using its own measurements (self-measurements) using on-board sensors such as IMUs. For example, $y_n^{(k,m)}$ could represent the distance measurement between the $k$th and $m$th nodes, measured by node $m$, whereas $y_n^{(k,k)}$ is the set of self-measurements of node $k$. More details on the possibly available measurements are given in Section III.
In this review, we focus on the Bayesian theory as a main workhorse for solving the addressed problem. This theory allows one to model the uncertainty about the system and the outcomes of interest by optimally combining prior knowledge and the information from observations (measurements). Within this framework, we want to compute the joint posterior distribution of the entire sequence of states given the measurements up to step $n$, i.e.,

$$p(x_{0:n} \mid y_{1:n}) = \frac{p(y_{1:n} \mid x_{0:n})\, p(x_{0:n})}{p(y_{1:n})} \qquad (1)$$

where $p(x_{0:n})$ is the prior distribution incorporating all the prior knowledge; $p(y_{1:n} \mid x_{0:n})$ is the perception model, i.e., the likelihood for the measurements accounting for the observation noise; and $p(y_{1:n})$ is a normalization constant. The joint posterior distribution (1) reflects all the up-to-date knowledge about the state of the system at step $n$.
The main drawback in applying (1) directly is that it has to be recomputed whenever a new measurement is taken, which makes the computational complexity intractable as $n$ increases. The complexity can be drastically reduced by resorting to a simplified but widely used model in which the states $x_{0:n}$ form a first-order Markov sequence so that $p(x_n \mid x_{0:n-1}, y_{1:n-1}) = p(x_n \mid x_{n-1})$, i.e., the state at step $n$ depends only on the state at step $n-1$. In addition, a current measurement $y_n$ is assumed to be conditionally independent of the past measurements and states, i.e., $p(y_n \mid x_{0:n}, y_{1:n-1}) = p(y_n \mid x_n)$. In most practical cases, the measurements are also conditionally independent, which implies that $p(y_n \mid x_n) = \prod_{m,k} p(y_n^{(m,k)} \mid x_n)$, where $p(y_n^{(m,k)} \mid x_n)$ is the likelihood of $x_n$ given the measurements $y_n^{(m,k)}$ between nodes $m$ and $k$. These assumptions lead to the probabilistic state-space Markovian model with the following ingredients:
• $p(x_0)$: prior information at time step 0, or the initial uncertainty about the state;
• $p(y_n \mid x_n)$: perception (or measurement) model, or how the unknowns and observations relate;
• $p(x_n \mid x_{n-1})$: mobility (or dynamic) model, or the prior information on the state evolution over time.
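Under the two assumptions above, the posterior of the current state can be propagated recursively instead of recomputing the joint posterior. The standard prediction and update steps of the Bayesian filter are recalled here as a reminder; they are the textbook recursion and are not copied from numbered equations of this paper:

```latex
% Prediction (Chapman-Kolmogorov) step, driven by the mobility model
p(x_n \mid y_{1:n-1}) = \int p(x_n \mid x_{n-1})\, p(x_{n-1} \mid y_{1:n-1})\, \mathrm{d}x_{n-1}

% Update step, driven by the perception model
p(x_n \mid y_{1:n}) = \frac{p(y_n \mid x_n)\, p(x_n \mid y_{1:n-1})}{p(y_n \mid y_{1:n-1})}
```

The recursion is initialized with the prior $p(x_0)$, so each new measurement requires only the previous filtering distribution.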
Often in theory and practice, one adopts the following general description of the state-space model:

$$x_n = g(x_{n-1}) + w_n \qquad (2)$$
$$y_n = h(x_n) + \nu_n \qquad (3)$$

where the function $g(\cdot)$ models the dynamics, the function $h(\cdot)$ maps the state to a measurement signal, and $w_n$ and $\nu_n$ are the process and measurement additive random noise, respectively.
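As an illustration of how such a state-space description is used, the following minimal sketch simulates a toy model and tracks it with a bootstrap particle filter. Everything specific here is an assumption made for the example and not taken from the paper: the one-dimensional constant-velocity choice for g, the single-anchor distance choice for h, the Gaussian noises, the prior, and all numerical values.

```python
import math
import random

def g(x, dt=1.0):
    """Hypothetical mobility model: constant velocity on a line, x = (position, velocity)."""
    pos, vel = x
    return (pos + dt * vel, vel)

def h(x, anchor=0.0):
    """Hypothetical perception model: distance to a single anchor."""
    return abs(x[0] - anchor)

def simulate(n_steps, sigma_w=0.05, sigma_v=0.3, seed=0):
    """Draw a state trajectory and noisy measurements from the model."""
    rng = random.Random(seed)
    x = (1.0, 0.5)  # assumed initial state
    states, meas = [], []
    for _ in range(n_steps):
        pos, vel = g(x)
        x = (pos + rng.gauss(0.0, sigma_w), vel + rng.gauss(0.0, sigma_w))  # process noise
        states.append(x)
        meas.append(h(x) + rng.gauss(0.0, sigma_v))  # measurement noise
    return states, meas

def particle_filter(meas, n_particles=500, sigma_w=0.05, sigma_v=0.3, seed=1):
    """Bootstrap particle filter returning the posterior-mean position per step."""
    rng = random.Random(seed)
    # Samples from an assumed Gaussian prior p(x_0).
    particles = [(rng.gauss(1.0, 0.5), rng.gauss(0.5, 0.2)) for _ in range(n_particles)]
    estimates = []
    for y in meas:
        # Propagate each particle through the mobility model p(x_n | x_{n-1}).
        moved = []
        for p in particles:
            pos, vel = g(p)
            moved.append((pos + rng.gauss(0.0, sigma_w), vel + rng.gauss(0.0, sigma_w)))
        # Weight each particle by the Gaussian likelihood p(y_n | x_n).
        weights = [math.exp(-0.5 * ((y - h(p)) / sigma_v) ** 2) for p in moved]
        total = sum(weights)
        if total == 0.0:  # all likelihoods underflowed: fall back to uniform weights
            weights, total = [1.0] * n_particles, float(n_particles)
        weights = [w / total for w in weights]
        estimates.append(sum(w * p[0] for w, p in zip(weights, moved)))
        # Multinomial resampling to limit weight degeneracy.
        particles = rng.choices(moved, weights=weights, k=n_particles)
    return estimates

states, meas = simulate(30)
estimates = particle_filter(meas)
print(f"final position error: {abs(estimates[-1] - states[-1][0]):.3f} m")
```

A Kalman filter would be the natural choice if both g and h were linear; the particle filter is used in this sketch only because the assumed distance measurement is nonlinear in the state.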

B. Performance Metrics
The requirements and performance metrics of interest are driven by the application. Due to the inherent uncertainties present in the system (e.g., the measurement noise), the node's state estimate will be characterized by errors as well. The position estimation error at time instant $n$ is given by the Euclidean distance between the estimated position $\hat{p}_n$ and the true position $p_n$ as $e(p_n) = \|\hat{p}_n - p_n\|$. From the errors, one can construct various statistics that can be used as performance metrics of a given method. For example, one may use the root mean square error (RMSE) of the position estimates, i.e.,

$$\mathrm{RMSE}_n = \sqrt{E\left(e^2(p_n)\right)} \qquad (4)$$

where $E(\cdot)$ indicates the statistical expectation of the argument. We note that, in general, the expectation is a function of time.
In practical performance evaluation tests, the expectation is approximated by a set of $L$ independent Monte Carlo trials as $\mathrm{RMSE}_n \approx \sqrt{(1/L) \sum_{\ell=1}^{L} e_\ell^2(p_n)}$, with $e_\ell(\cdot)$ being the error at the $\ell$th trial. The RMSE is often referred to as accuracy as it is a measure of the statistical deviation of the position estimate from the real position. A similar definition can be given with reference to the node's velocity and orientation. The position mean square error (MSE) is the sum of the MSEs of the estimates of the elements of $p_n$ [9].
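The Monte Carlo approximation of the RMSE can be sketched as follows. The function name and the data layout (a list of trials, each a list of 2-D positions per time step) are assumptions made for this example.

```python
import math

def rmse_per_step(estimates, truths):
    """Per-step RMSE over L Monte Carlo trials.

    estimates[l][n] and truths[l][n] are (x, y) positions for trial l and step n.
    """
    n_trials = len(estimates)
    n_steps = len(estimates[0])
    rmse = []
    for n in range(n_steps):
        # Sum of squared Euclidean errors over the trials at step n.
        sq_sum = sum(math.dist(estimates[l][n], truths[l][n]) ** 2
                     for l in range(n_trials))
        rmse.append(math.sqrt(sq_sum / n_trials))
    return rmse

# Two trials, one time step: errors of 5 m and 0 m.
print(rmse_per_step([[(3.0, 4.0)], [(0.0, 0.0)]],
                    [[(0.0, 0.0)], [(0.0, 0.0)]]))
```

The result is a list with one RMSE value per time step, reflecting that the expectation is, in general, a function of time.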
It is always important to compare the tracking performance of a proposed method with the theoretically best achievable performance. To that end, we resort to the posterior Cramér-Rao bound (PCRB) [10]. We recall that in tracking, the aim is to estimate $x_n$ modeled by (2) and observed via (3). If we define the covariance matrix of a particular estimator $\hat{x}_n$ by $C_n$, where

$$C_n = E\left[(\hat{x}_n - x_n)(\hat{x}_n - x_n)^T\right] \qquad (5)$$

then we must have

$$C_n \succeq J_n^{-1} \qquad (6)$$

where the inequality means that $C_n - J_n^{-1}$ is positive semidefinite, and $J_n$ is the Bayesian information matrix, which is a sum of two information matrices, the observation information matrix and the a priori information matrix. In (5), the state $x_n$ is random, and the expectation is with respect to $x_n$ and the random observations. We point out that the bound is computed for models (2) and (3), where the a priori and observation information matrices are obtained by exploiting (2) and (3), respectively. Moreover, it is a bound that is for the estimate $\hat{x}_n$ of the complete state vector and not only of the node's position at time $n$. Every time instant has its own bound of the covariance of the state. A key issue in computing $J_n$ is to make the computation recursive. This is readily done for the linear state-space model. When the model is nonlinear, recursive equations can be formulated (see [11]); however, unlike in the linear case, the bounds have to be computed by simulations. With knowledge of the indoor environment and the used type of sensors, one can construct priors that allow for obtaining better PCRBs [12], [13]. Recent results on the PCRB and other bounds can be found, for example, in [14] and [15].
The RMSE may not be fully descriptive of the accuracy of an employed method. To correct for that, one may use temporal ratios of confidence in the estimate. For example, one could compute the percentage of time of an error being above some threshold. This representation can be seen as a localization error outage (LEO), where an outage event is defined as the event when the error exceeds a threshold $e_{th}$, i.e.,

$$\mathrm{LEO}(e_{th}) = P\left(e(p_n) > e_{th}\right) \qquad (7)$$

where $P(A)$ indicates the probability of event $A$, and $e_{th}$ is the threshold (i.e., the maximum allowable position estimation error). The probability is evaluated over the ensemble of all possible spatial positions and time instants resulting in a global performance index [14]. An equivalent index, which is often adopted in the literature, is the cumulative distribution function of the position estimation error defined by $F_e(e_{th}) = 1 - \mathrm{LEO}(e_{th})$. In specific applications, it could be of interest to evaluate the LEO only for a subset of spatial positions belonging to specific trajectories [16]. In such cases, the calculated LEO would indicate the ability of the system to track the mobile node when moving along predefined trajectories, and hence, it would be trajectory dependent. Given the same level of accuracy, two different systems could give different LEOs. For example, a system characterized by LEO(1.5) = 0.1 (the error exceeds 1.5 m in 10% of the cases) is performing better than a second system characterized by LEO(1.5) = 0.5 (the error exceeds 1.5 m in 50% of the cases). One may use similar metrics for the velocity and the orientation.
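An empirical version of the LEO and of the associated error CDF can be computed from a collection of recorded position errors, as in this minimal sketch. The function names and the sample error values are hypothetical.

```python
def leo(errors, e_th):
    """Empirical localization error outage: fraction of errors above e_th."""
    return sum(e > e_th for e in errors) / len(errors)

def error_cdf(errors, e_th):
    """Empirical CDF of the position error, i.e., 1 - LEO(e_th)."""
    return 1.0 - leo(errors, e_th)

errors_m = [0.5, 1.0, 2.0, 3.0]  # hypothetical position errors in meters
print(leo(errors_m, 1.5), error_cdf(errors_m, 1.5))
```

Evaluating `leo` on errors gathered along one trajectory only would give the trajectory-dependent variant mentioned in the text.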
Other performance metrics include the coverage and the robustness of the applied method. The former indicates the area where mobile nodes can be tracked by the method within specific accuracy, and the latter, the resistance of the method to some impairments including lack of radio visibility, measurement outliers, and node failures. When the tracking system operates in real time, the localization update rate, which is defined as the number of position estimates computed per second, is a parameter of importance, particularly in navigation systems. Scalability represents another important feature of a method. That is, not every method can be applied in large-scale networks and using low-cost nodes.

III. TYPES OF MEASUREMENTS
This section presents the types of measurements used for tracking. We classify them as 1) measurements directly related to the geometric constraints between nodes, 2) measurements that are not related to the geometric relationship among nodes, and 3) self-measurements with information on node acceleration and orientation. We also discuss the main sources of error that are present in indoor environments.

A. Geometric-Related Measurements
The optimum approach to manage the measurements (e.g., the received signal waveform) is to use them directly as input $y_n$ to the tracking estimator (direct position estimation). However, for complexity and implementation constraints, a more pragmatic but suboptimal two-step approach is followed in practical systems. It consists of estimating the geometric quantities from the signal features, such as the distance between nodes, and then feeding the tracking estimator with these values (two-step position estimation).
Here, we give an overview of a number of wireless measurements that convey geometric constraints between nodes.
1) RSS: Distance estimation (or ranging) based on received signal strength (RSS) measurements relies on the principle that the greater the distance between two nodes, the weaker their relative received signals. This technique is commonly used in low-cost systems such as wireless sensor networks (WSNs) or Wi-Fi because of the easy availability of this type of measurement. The mapping between the measured RSS and the distance between the transmitting and receiving nodes is typically done by using theoretical and/or empirical path-loss models.
A widely used statistical model to characterize the RSS is given by [17]

$$P_r(d) = P_0 - 10\gamma \log_{10} d + S \qquad (8)$$

where $P_r(d)$ (dBm) is the received signal power at a distance $d$ from the emitter, $P_0$ is the received power (dBm) at a reference distance of 1 m (which depends on the radio and antenna characteristics as well as the signal wavelength), $d$ (m) is the separation between nodes, and $S$ (dB) represents the large-scale fading variations (i.e., shadowing). It is common to model $S$ as a Gaussian random variable with zero mean and standard deviation $\sigma_S$. The parameter $\gamma$ is known as the path-loss exponent, which, in indoor environments, typically assumes values between 2 and 6 [17]. More sophisticated RSS models could also be considered, e.g., models that introduce dependence of $\sigma_S$ and $\gamma$ on distance [18].
The main advantage of RSS-based approaches compared with other methods is the availability of RSS measurements in practically all wireless systems and the fact that the nodes do not have to be time synchronized. The most relevant drawback of RSS ranging is that in cluttered environments, the propagation phenomena cause the attenuation of the signal to be poorly correlated with distance, particularly in non-line-of-sight (NLOS) channel conditions, resulting in inaccurate distance estimates, as discussed in Section III-D.
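The log-distance model can be inverted to obtain a simple range estimate from an RSS reading. The sketch below assumes hypothetical values for the reference power (-40 dBm) and the path-loss exponent (3); in practice these would be calibrated for the environment.

```python
import math
import random

def rss_dbm(d, p0=-40.0, gamma=3.0, sigma_s=0.0, rng=random):
    """Received power (dBm) at distance d (m): log-distance path loss plus shadowing."""
    shadowing = rng.gauss(0.0, sigma_s) if sigma_s > 0.0 else 0.0
    return p0 - 10.0 * gamma * math.log10(d) + shadowing

def rss_to_distance(p_r, p0=-40.0, gamma=3.0):
    """Invert the shadowing-free model: d = 10 ** ((P0 - Pr) / (10 * gamma))."""
    return 10.0 ** ((p0 - p_r) / (10.0 * gamma))

print(rss_to_distance(rss_dbm(10.0)))   # shadowing-free reading at 10 m
print(rss_to_distance(-70.0 + 4.0))     # same link with +4 dB of shadowing
```

Because the inversion is exponential in the measured power, a shadowing term of S dB multiplies the distance estimate by 10^(-S/(10γ)), so the ranging error scales with the true distance.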
2) Time of Arrival (TOA): Information related to the separation distance $d$ between a pair of nodes can be obtained by using measurements of the signal propagation delay or time of flight (TOF) $\tau_p = d/c$, where $c$ is the speed of electromagnetic waves in air ($c \approx 3 \cdot 10^8$ m/s). This is usually accomplished using a two-way TOA (TW-TOA) ranging protocol or time-difference-of-arrival (TDOA) techniques.
In TW-TOA ranging, node A transmits a packet to node B, which replies by transmitting an acknowledgment packet to A after a known or measured response delay $\tau_d$ [3]. Then, node A estimates the signal round-trip time $\tau_{RT} = 2\tau_p + \tau_d$, from which it can calculate the distance without the need of a common time reference. While synchronization offsets are intrinsically eliminated by the two-way protocol, a relative clock drift might still affect the ranging accuracy.
3) TDOA: Systems that use TDOA do not rely on absolute distance estimates between pairs of nodes. Such systems typically employ one of two schemes. According to the first scheme, multiple signals are broadcast from synchronized anchor nodes, and the mobile node measures the TDOA (this technique is similar to that adopted by the GNSS technologies). According to the second scheme, a reference signal is broadcast by the mobile node, and it is received by several anchors. The anchors share their estimated TOA and compute the TDOA. Each scheme requires that the anchors are tightly synchronized through a network. To calculate the 2-D position of the mobile node, at least three anchors and two TDOA measurements are needed. Ideally, each TDOA measurement can be geometrically interpreted as a hyperbola formed by a set of points with constant range differences (time differences) from two anchors [19].
4) AOA: Angle-based techniques estimate the position of a mobile node by measuring the angle of arrival (AOA) of signals arriving at the measuring node through the adoption of antenna arrays. With perfect measurements, the positioning problem can be geometrically solved by finding the intersection of a number of straight lines representing the signal AOA (triangulation). In 2-D scenarios, two AOAs are sufficient. In practice, noise, a finite number of antennas in the array, and multipath propagation might drastically impact the accuracy of the final position estimate [20].
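The distance computation in TW-TOA follows directly from the round-trip relation. A minimal sketch, with a hypothetical function name and example timings:

```python
C = 299_792_458.0  # speed of light in vacuum (m/s), close to its value in air

def twtoa_distance(tau_rt, tau_d):
    """Distance (m) from round-trip time and response delay, both in seconds."""
    return C * (tau_rt - tau_d) / 2.0

tau_d = 1e-3                      # assumed response delay of node B (1 ms)
tau_rt = 2.0 * 10.0 / C + tau_d   # ideal round trip for nodes 10 m apart
print(twtoa_distance(tau_rt, tau_d))
print(twtoa_distance(tau_rt + 1e-9, tau_d) - twtoa_distance(tau_rt, tau_d))
```

The second printed value shows that a 1 ns error in the measured round-trip time maps to roughly 15 cm of range error, which is why timing precision matters at indoor scales.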
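For the noise-free 2-D case, triangulation from two AOAs reduces to intersecting two lines. The sketch below assumes that each anchor measures the bearing toward the mobile node in a common global frame (radians, counterclockwise from the x axis); the function name and the geometry are hypothetical.

```python
import math

def triangulate(a1, theta1, a2, theta2):
    """Intersect two bearing lines a_i + t_i * (cos theta_i, sin theta_i)."""
    u1 = (math.cos(theta1), math.sin(theta1))
    u2 = (math.cos(theta2), math.sin(theta2))
    dx, dy = a2[0] - a1[0], a2[1] - a1[1]
    # Solve t1 * u1 - t2 * u2 = a2 - a1 with Cramer's rule.
    det = -u1[0] * u2[1] + u2[0] * u1[1]
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel: no unique intersection")
    t1 = (-dx * u2[1] + u2[0] * dy) / det
    return (a1[0] + t1 * u1[0], a1[1] + t1 * u1[1])

# Anchors at (0, 0) and (10, 0); the node at (5, 5) is seen at 45 and 135 degrees.
print(triangulate((0.0, 0.0), math.radians(45.0), (10.0, 0.0), math.radians(135.0)))
```

With noisy bearings from more than two anchors, the lines no longer meet in one point, and a least-squares or Bayesian estimator would replace the exact intersection.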
5) PDOA: Phase-difference-of-arrival (PDOA) techniques were originally introduced for distance estimation in radar systems and have been recently rediscovered to improve the localization accuracy of RFID and WSN systems [21]. The basic version of PDOA consists of transmitting a pair of continuous wave signals at frequencies $f_1$ and $f_2$, respectively, and measuring the phase difference at the receiver; the distance is proportional to this phase difference and inversely proportional to the frequency difference $f_2 - f_1$. Due to the extremely small signal bandwidth, phase estimation errors can be very small, and hence, the distance estimation is quite accurate. Regrettably, the $2\pi$ phase periodicity and the presence of multipath might create unavoidable ambiguities in evaluating the true distance.
6) Proximity: The simplest way to obtain informative measurements for positioning is proximity where binary connectivity is used to estimate the nodes' positions at time $n$. The location information is provided from the proximity of the mobile node to some of the anchor nodes in the system. One very simple model for the definition of proximity is the so-called circular radio coverage model or disk model, where the transmission range is modeled by a circle with fixed radius $r_0$. A key advantage of the proximity technique is that it does not require any dedicated hardware or time synchronization among the nodes. This makes it particularly suited for very low-cost wireless devices such as RFID tags where the deployment of a large number of tags is not an issue [22]. Starting from connectivity information, more sophisticated range-free positioning approaches can be introduced to enhance the tracking accuracy such as those referenced in Section IV-F.
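The relation between phase difference and distance, and the ambiguity caused by phase wrapping, can be illustrated with a short sketch. It assumes one-way propagation, so that the phase at frequency f is 2πfd/c; a backscatter (round-trip) RFID link would halve the resulting distances. The function names and the two carrier frequencies are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light (m/s)

def phase_difference(d, f1, f2):
    """Wrapped phase difference (rad) between two tones after one-way travel over d."""
    return (2.0 * math.pi * abs(f2 - f1) * d / C) % (2.0 * math.pi)

def pdoa_distance(delta_phi, f1, f2):
    """One-way distance (m) implied by a phase difference wrapped to [0, 2*pi)."""
    return C * (delta_phi % (2.0 * math.pi)) / (2.0 * math.pi * abs(f2 - f1))

def unambiguous_range(f1, f2):
    """Largest distance that can be measured before the phase difference wraps."""
    return C / abs(f2 - f1)

f1, f2 = 902e6, 903e6            # two assumed tones, 1 MHz apart
r_max = unambiguous_range(f1, f2)
print(pdoa_distance(phase_difference(30.0, f1, f2), f1, f2))
print(pdoa_distance(phase_difference(30.0 + r_max, f1, f2), f1, f2))
```

Both calls return about 30 m: a node located one unambiguous range farther away produces the same wrapped phase, which is the ambiguity noted in the text.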
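A minimal sketch of proximity-based positioning under the disk model follows. The centroid rule used to turn connectivity into a position is one simple range-free choice assumed for this example; it is not claimed to be the method of any specific reference, and the anchor layout is hypothetical.

```python
import math

def connected_anchors(node, anchors, r0):
    """Disk model: an anchor is connected if it lies within radius r0 of the node."""
    return [a for a in anchors if math.dist(node, a) <= r0]

def centroid_estimate(anchors_in_range):
    """Position estimate as the centroid of the connected anchors (None if empty)."""
    if not anchors_in_range:
        return None
    n = len(anchors_in_range)
    return (sum(a[0] for a in anchors_in_range) / n,
            sum(a[1] for a in anchors_in_range) / n)

anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (10.0, 10.0)]
heard = connected_anchors((1.0, 1.0), anchors, r0=3.5)
print(heard, centroid_estimate(heard))
```

The estimate can only be as fine as the anchor density allows, which is consistent with the technique being suited to deployments with many inexpensive tags.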

B. Position-Related Measurements
Signals generated by predeployed infrastructures, such as Wi-Fi, broadcast stations (television and FM or AM radio), and cellular networks, are already present in most indoor environments and can be potentially exploited for positioning without the need for deploying dedicated infrastructures [23], [24]. Such radio signals of opportunity are designed for other purposes and are not intended for positioning. Indeed, these signals are usually received in NLOS channel conditions, and hence, their dependence on the relative distances/angles among the nodes is complex. As a result, inference from such measurements is very challenging. However, this does not prevent their exploitation for positioning if fingerprinting methods are considered, for example. In fact, such methods are based on the uniqueness of the measurements (fingerprint) at different locations that is exploited using mapping approaches described in Section IV-C.
In addition to radio signals of opportunity, the geomagnetic field has been recently proposed as a viable alternative (or complementary) signal of opportunity for positioning through the use of low-cost magnetometers that also provide orientation information. In fact, anomalies in the field caused by magnetic disturbances, typically present in indoor environments, can be used as a fingerprint [25]. The main cause of these disturbances is the steel shells of most modern buildings. In [26], it has been experimentally shown that the magnetic field is stable for a long time and that its characteristics significantly change with location, making it suitable for fingerprinting approaches. Although in some scenarios the achievable accuracy can be on the order of a few centimeters, the availability of only three components of the magnetic field in the X, Y, and Z directions (two if the magnetic north is unknown) makes the uniqueness of the measurements as a function of position problematic. Moreover, interference from moving objects containing ferromagnetic materials and electronic devices might cause difficulties in modeling the consequent anomalies in the measured magnetic field. For this reason, magnetometers are typically coupled with other kinds of measurements (e.g., radio and inertial) via data fusion methods, as explained in Section IV-C.
A clear advantage in using signals of opportunity for positioning is that these are cost-effective solutions, since no additional infrastructure deployment is required. For completeness, we mention other signals of opportunity and sensors that can be exploited for tracking such as ambient audio and light [27], [28], ultrasound [29], and video signals [30].

C. Self-Measurements: Inertial Devices
Most handheld devices incorporate small and light IMUs based on microelectromechanical systems (MEMS) technology. Typically, an IMU is composed of three orthogonal gyroscopes and three orthogonal accelerometers [31]. These triads of sensors measure angular velocity and a specific force, respectively. The specific force is a combination of gravitational and inertial linear acceleration. Some IMUs include 3-D magnetometers (delivering heading information) and a barometer/altimeter, thus providing 10 degrees of freedom. Standalone inertial navigation is possible, given that the initial position, velocity, and orientation are known. The dominant type of inertial navigation system (INS) is the so-called strapdown system, where the inertial sensors are rigidly mounted onto the device, and thus, the measurements are referred to the body frame. In this case, the gyroscopes are used to project the accelerometer observations onto the global frame, as well as to provide orientation information. Then, after correcting for gravity, one could integrate twice the accelerations and obtain the estimated position by the INS.
The generation of attitude, position, and velocity involves, in part, integration of the sensor measurements. Therefore, any error on the output of the sensors leads to correlated attitude, position, and velocity errors that are potentially unbounded. The error sources of the accelerometers can be modeled by a constant (deterministic) bias and a white noise (random) term, both a priori unknown. Other sources could be seen as time fluctuations of these two. Similarly, one could model the errors of the gyroscopes. These reasons prevent the consideration of standalone inertial navigation with a relatively low-cost IMU, leaving this approach to high-performance tactical grade devices.
IMUs are very popular in navigation systems, particularly when they are integrated with other technologies [32]. The reason is the complementarity of errors between inertial sensors and the geometric-based approaches for position estimation. While an INS provides very accurate acceleration (and, thus, position) measurements, it produces an error that increases over time because of the sensor biases. On the other hand, the geometric-related measurements discussed in Section III-A are typically unbiased, at least in line-of-sight (LOS) conditions, but are more noisy than those of an INS. Therefore, proper data fusion of geometric-based systems with an IMU brings the best of both worlds: reduced variance and unbiasedness. Such data fusion can be optimally handled using Bayesian theory and the associated methodologies described in Section IV.
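The unbounded growth of position error under double integration can be seen with a one-dimensional toy example. The sketch assumes a stationary node whose accelerometer reports only a constant bias of 0.01 m/s² (a hypothetical value), sampled at 100 Hz, and uses simple Euler integration.

```python
def dead_reckon(accels, dt, p0=0.0, v0=0.0):
    """Integrate accelerations twice (Euler steps) to obtain the final position."""
    p, v = p0, v0
    for a in accels:
        v += a * dt   # first integration: acceleration to velocity
        p += v * dt   # second integration: velocity to position
    return p

bias = 0.01            # assumed constant accelerometer bias (m/s^2)
dt, n = 0.01, 6000     # 100 Hz sampling for 60 s
drift = dead_reckon([bias] * n, dt)
print(f"position drift after 60 s: {drift:.2f} m")
print(f"analytic 0.5 * b * t^2:    {0.5 * bias * 60.0 ** 2:.2f} m")
```

A constant bias b produces a position error that grows as b·t²/2, about 18 m after one minute in this example, which illustrates why standalone inertial navigation with low-cost sensors is impractical.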

D. Main Sources of Error
The tracking performance of any method is highly dependent on the quality of the gathered measurements. Both technological constraints (e.g., device's clock accuracy) and radio propagation anomalies in indoor environments introduce sources of error, including multipath, thermal noise, direct-path excess delay, and NLOS channel conditions.
Multipath propagation can be severe in harsh scenarios. When narrowband systems are used, for example to extract RSS values, signal components coming via different propagation paths usually cannot be resolved. This results in destructive and constructive interference of components causing fading effects at a small-scale level, thus making the correlation of RSS with distance extremely weak. In wideband or UWB systems, multipath can be, in part, resolved, and accurate TOA signal estimation is possible. However, the presence of a high number of multipath components might make the detection of the direct path, carrying the correct distance information, a nontrivial task [33].
In small areas, time-based ranging relies on precise time measurements that are accomplished by equipping nodes with an oscillator from which an internal clock reference is derived. Physical effects can cause oscillators to experience frequency drifts that could be detrimental in systems with low-cost oscillators. Considering that the achievement of submeter ranging accuracy requires the estimation of TOF on the order of a few nanoseconds, estimation uncertainty of the received signal TOA and device's clock drift might not be negligible in indoor environments, even in the (ideal) absence of multipath. To get an idea on the basic parameters affecting the ranging accuracy, we illustrate the fundamental limit in the estimation of the TOA, i.e., $\tau$, of a generic unitary energy signal $s(t)$ with a spectrum $S(f)$ and transmitted through an additive white Gaussian noise channel. In the absence of other sources of error, the smallest variance of an unbiased estimator of $\tau$ is given by the CRB [34], i.e.,

$$\mathrm{Var}(\hat{\tau}) \ge \frac{1}{8\pi^2 \beta^2\, \mathrm{SNR}} \qquad (9)$$

where SNR is the signal-to-noise ratio and $\beta$ represents the effective bandwidth of $s(t)$, with $\beta^2 = \int_{-\infty}^{\infty} f^2 |S(f)|^2\, df$. The corresponding CRB on ranging can be easily obtained by multiplying (9) by the squared speed of light $c^2$. Notice that the lower bound in (9) reveals that signals with a large SNR and wide transmission bandwidth are beneficial for ranging. This justifies the large interest in the UWB technology in indoor RTLS [33]. As a comparison, the CRB for a distance estimate $\hat{d}$ based on RSS measurements under the path-loss model (8) is given by [35]

$$\mathrm{CRB}_{\mathrm{RSS}} = \left(\frac{\ln 10}{10}\, \frac{\sigma_S}{\gamma}\, d\right)^2$$

In contrast to time-based methods, the bound on ranging with RSS measurements does not depend on the shape of the transmitted signal, but it rapidly increases with distance (with $d^2$). On the other hand, for time-based ranging methods, the signal shape, and hence the bandwidth, represents an additional degree of freedom to improve the ranging accuracy [as is evident from (9)].
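To give a feeling for the orders of magnitude, the two ranging bounds can be evaluated numerically. The sketch below assumes the standard forms of the bounds, namely a TOA variance bound of 1/(8π²β²·SNR) and an RSS range standard-deviation bound of (ln 10/10)(σ_S/γ)d; the function names and all numerical values (1 GHz effective bandwidth, 10 dB SNR, 4 dB shadowing, path-loss exponent 3) are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light (m/s)

def toa_range_std_bound(beta_hz, snr_linear):
    """Lower bound on the range standard deviation (m) for TOA-based ranging."""
    crb_tau = 1.0 / (8.0 * math.pi ** 2 * beta_hz ** 2 * snr_linear)
    return C * math.sqrt(crb_tau)

def rss_range_std_bound(d, sigma_s_db, gamma):
    """Lower bound on the range standard deviation (m) for RSS-based ranging."""
    return (math.log(10.0) / 10.0) * (sigma_s_db / gamma) * d

print(f"TOA, beta = 1 GHz, SNR = 10 dB: {toa_range_std_bound(1e9, 10.0) * 100:.2f} cm")
print(f"RSS, d = 10 m, sigma_S = 4 dB:  {rss_range_std_bound(10.0, 4.0, 3.0):.2f} m")
```

Under these assumed values the TOA bound is on the order of a centimeter while the RSS bound is on the order of meters at 10 m, which matches the qualitative comparison made in the text.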
Finally, we note that the most challenging problems for tracking in indoor environments are caused by NLOS measurements. When measuring RSS values, very poor or almost zero correlation between RSS and distance could be obtained, and such measurements can only be usefully exploited by nongeometric approaches such as fingerprinting tracking algorithms (see Section IV-C). In TOA-based approaches, if the direct path is completely obstructed, the receiver will only observe NLOS multipath components, resulting in estimated distances larger than the true distance (outliers). Therefore, it is important to design tracking methods so that they are less susceptible to these errors (e.g., cooperative tracking schemes discussed in Section IV-B) or to introduce NLOS identification schemes and compensate for the positive bias present in NLOS measurements. In the literature, several NLOS detection schemes have been presented. Most of them rely on the extraction of received signal features that are mainly affected by NLOS propagation and take a decision based on some a priori statistical knowledge or learning approach [36], [37]. It is worth mentioning that even if the direct path is not completely blocked by an obstacle, the measured TOA could be overestimated due to the excess delay experienced by the electromagnetic wave when traveling through different materials [38].
Other potential sources of errors are cochannel interference [39], which is caused by coexisting wireless systems sharing the same radio band, and environment variability that could make tracking methods based on fingerprinting less reliable.

IV. METHODS
In the previous section, we described various types of measurements that are used for indoor tracking. Tracking methods process these measurements to produce reliable sequences of position estimates of mobile nodes. The main challenge of the methods is the various types of errors in the measurements as well as the anomalies caused by the environment and its dynamic variations.
Here, we first present the main methods for indoor tracking, including cooperative and fingerprinting approaches. Then, we formulate the SLAM problem, which is intimately related to that of tracking, and review fusion methods, where information from measurements of different sensors is combined to obtain improved tracking. Finally, a brief overview of other methods is given.

A. Bayesian Tracking
In Section II, we introduced the problem of tracking by using a state-space model as given by (2) and (3). Here, we present the general solution to tracking provided by the Bayesian filter, illustrate the types of estimates one can obtain with the filter, and describe some of its implementations.
1) Bayesian Filter: In many practical settings, we are not interested in obtaining, at a given time instant $n$, the full joint posterior distribution of the sequence of states $x_{0:n}$. Instead, the marginal posterior distribution $p(x_n | y_{1:n})$ of the current state $x_n$, given all the measurements $y_{1:n}$ collected up to time $n$, is sufficient. This posterior quantifies the belief we have in the values of the state $x_n$ given the measurements $y_{1:n}$. Here, we present a recursive approach for the evaluation of $p(x_n | y_{1:n})$ that requires a constant number of computations at each time instant $n$. It is based on Bayes' theory, and therefore, we refer to it as Bayesian filtering. Its formulation is as follows.
• Initialization: The marginal at time step 0 is set to the prior $p(x_0)$ of $x_0$.
• Prediction step: By exploiting the mobility model, the predictive distribution of state $x_n$ at time instant $n$ is given by

$$p(x_n | y_{1:n-1}) = \int p(x_n | x_{n-1})\, p(x_{n-1} | y_{1:n-1})\, dx_{n-1} \tag{11}$$

• Update step: The marginal posterior distribution of $x_n$, given the new incoming measurement $y_n$ at time instant $n$ (and all past measurements), can be computed using Bayes' rule and the perception model according to

$$p(x_n | y_{1:n}) = \frac{p(y_n | x_n)\, p(x_n | y_{1:n-1})}{p(y_n | y_{1:n-1})} \tag{12}$$

We observe that in the prediction step, the previous marginal posterior $p(x_{n-1} | y_{1:n-1})$ and the mobility model are used to obtain the predictive distribution $p(x_n | y_{1:n-1})$. The new marginal posterior, i.e., $p(x_n | y_{1:n})$, is determined in the update step. The prediction step requires integration, and the update step requires finding a product of two functions. We note that the denominator in (12) is just a normalizing constant.
2) State Estimation: Once the marginal posterior distribution $p(x_n | y_{1:n})$ of the current state $x_n$ is computed, we can obtain from it any point estimate or confidence interval that we desire. A point estimate $\hat{x}_n$ of $x_n$ can be defined by using some criteria. Within the Bayesian methodology, the most common are the minimum mean square error (MMSE) and the maximum a posteriori (MAP) criteria [34]. According to them, the respective point estimates are defined by

$$\hat{x}_n^{\mathrm{MMSE}} = \int x_n\, p(x_n | y_{1:n})\, dx_n$$

$$\hat{x}_n^{\mathrm{MAP}} = \arg\max_{x_n}\, p(x_n | y_{1:n})$$

It is well known that when the posterior distributions are Gaussian, the MAP and MMSE estimates coincide.
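The recursion (11) and (12) and the two point estimates can be illustrated on a discretized state space, where the integral in the prediction step becomes a sum. The following Python sketch is a minimal example under assumed models that are not taken from the paper: a 1-D position on a grid, a Gaussian random-walk mobility model, and a Gaussian perception model that observes the position directly. All names and numerical values are hypothetical.

```python
import numpy as np

def gaussian(x, mean, std):
    """Unnormalized Gaussian shape; normalization is handled explicitly below."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2)

grid = np.linspace(0.0, 10.0, 201)  # candidate positions (m)

# Assumed mobility model p(x_n | x_{n-1}): random walk with 0.3-m std.
# Column j holds the probability mass of x_n given x_{n-1} = grid[j].
transition = gaussian(grid[:, None], grid[None, :], 0.3)
transition /= transition.sum(axis=0, keepdims=True)

def predict(posterior_prev):
    """Prediction step (11): the integral over x_{n-1} becomes a sum."""
    return transition @ posterior_prev

def update(predictive, y, meas_std=0.5):
    """Update step (12): multiply by the likelihood p(y_n | x_n), then normalize."""
    unnormalized = gaussian(y, grid, meas_std) * predictive
    return unnormalized / unnormalized.sum()

belief = np.full(grid.size, 1.0 / grid.size)  # flat prior p(x_0)
for y in [4.1, 4.4, 4.9, 5.2]:                # hypothetical position measurements
    belief = update(predict(belief), y)

mmse_estimate = float(np.sum(grid * belief))    # posterior mean
map_estimate = float(grid[np.argmax(belief)])   # posterior mode
```

Because the posterior in this example is close to Gaussian, the two estimates nearly coincide; the cost of a grid grows quickly with the state dimension, which is one reason for the approximate filters discussed later in this section.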
In some methods, the state transition model given by (2) is ignored, and thus, there is no prior constructed for the estimation of the state at the next time instant. In fact, the prior is considered to be proportional to a constant. If only current measurements are considered (without taking into account past state estimates to avoid exponentially increasing complexity), the posterior is obtained from the perception model only, and the point estimate that corresponds to the MAP estimate is known as the maximum-likelihood (ML) estimate. Formally, we write

$$\hat{x}_n^{\mathrm{ML}} = \arg\max_{x_n}\, p(y_n | x_n)$$

When the measurements can be put in the form as in (3), and no statistical characterization is available for the measurement noise $\nu_n$, least squares (LS) estimators offer a valid alternative [38]. When the measurement noise is Gaussian and the model is linear, the LS and ML estimates are the same. Theoretical performance bounds on ML position estimates can be found in [40] and [41].
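As an illustration of an LS estimator that uses only the current measurements, the following Python sketch estimates a 2-D position from range measurements to anchors at known positions by Gauss-Newton iterations. The anchor layout, the noise-free ranges, and the function name `ls_position` are assumptions made for this example; they are not taken from [38].

```python
import numpy as np

def ls_position(anchors, ranges, x0, iters=20):
    """Nonlinear LS: minimize sum_i (r_i - ||x - a_i||)^2 by Gauss-Newton."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        diffs = x - anchors                      # (K, 2) vectors from anchors to x
        dists = np.linalg.norm(diffs, axis=1)    # ranges predicted at current x
        jacobian = diffs / dists[:, None]        # derivative of predicted ranges w.r.t. x
        residuals = ranges - dists
        step, *_ = np.linalg.lstsq(jacobian, residuals, rcond=None)
        x = x + step
    return x

# Hypothetical setup: four anchors at the corners of a 10 m x 10 m room.
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_pos = np.array([3.0, 6.0])
ranges = np.linalg.norm(true_pos - anchors, axis=1)  # noise-free for illustration
estimate = ls_position(anchors, ranges, x0=[5.0, 5.0])
```

With noisy ranges, the same iteration returns the position that minimizes the sum of squared range residuals; the starting point must not coincide with an anchor, since the Jacobian is undefined there.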
3) Filtering Algorithms: We recall that the unknown state is considered a Markovian stochastic process and that it has to be sequentially estimated. Equations (11) and (12) provide the predictive and filtering densities of the state $x_n$, and they are the complete solution to the tracking problem. The computation of these densities is difficult except in a few cases, including the ever important linear Gaussian model. In the latter case, the functions $g(\cdot)$ and $h(\cdot)$ in (2) and (3) are linear, and the process and observation noises are Gaussian. Then, the optimal solution is given by the Kalman filter (KF) [42]. We point out that this filter is also optimal, among linear filters, in the case when we do not make distributional assumptions about the noise except that it is zero mean and with finite covariance. Then, the KF provides the optimal solution in the LS sense.
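For the linear Gaussian case, the prediction and update steps reduce to closed-form recursions on the mean and covariance of the state. The Python sketch below shows one KF recursion together with a small usage example. The matrix names (`F`, `Q`, `H`, `R`) and the 1-D constant-velocity example are assumptions for illustration and do not reproduce the notation of (2) and (3).

```python
import numpy as np

def kf_step(mean, cov, y, F, Q, H, R):
    """One KF recursion for x_n = F x_{n-1} + noise(Q), y_n = H x_n + noise(R)."""
    # Prediction: propagate mean and covariance through the linear mobility model.
    mean_pred = F @ mean
    cov_pred = F @ cov @ F.T + Q
    # Update: correct the prediction with the new measurement.
    innovation_cov = H @ cov_pred @ H.T + R
    gain = cov_pred @ H.T @ np.linalg.inv(innovation_cov)
    mean_new = mean_pred + gain @ (y - H @ mean_pred)
    cov_new = (np.eye(mean.size) - gain @ H) @ cov_pred
    return mean_new, cov_new

# Hypothetical example: state = [position, velocity], sampling interval 1 s.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])          # only the position is measured
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
mean, cov = np.zeros(2), 10.0 * np.eye(2)
for n in range(1, 11):
    y = np.array([float(n)])        # noise-free positions of a node moving at 1 m/s
    mean, cov = kf_step(mean, cov, y, F, Q, H, R)
```

After a few steps, the filter recovers both the position and the unobserved velocity, which illustrates how the mobility model couples successive position estimates.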
Once models (2) and (3) deviate from linearity, one has to resort to approximating approaches. Two popular methods are the extended KF (EKF) [42] and the unscented KF (UKF) [43]. The former is based on linearization of the nonlinear functions in the model and assumes that the noise is Gaussian. The latter avoids linearization and employs a deterministic sampling approach to parameterize the mean and covariance of the state vector. Basically, the integrals in the Bayesian recursion are numerically solved by the unscented transformation, which requires $2d_x + 1$ carefully selected points (also referred to as sigma points) around the mean, where $d_x$ is the dimension of the state.
Recently, enhanced KF-like methods based on more precise numerical integration rules have appeared. They fundamentally differ from the UKF in the generation and weighting of the deterministic samples that are employed to propagate the mean and covariances of the distributions of interest. They are based on the cubature and Gauss-Hermite quadrature rules. For more information on these and their variants, see [44]-[46].
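The unscented transformation can be sketched in a few lines. The Python example below uses one common choice of sigma points and weights with a scaling parameter `kappa`; this parameterization, the function name, and the range-measurement example are assumptions for illustration and are not necessarily the variant used in [43].

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate (mean, cov) through f using 2*d + 1 sigma points."""
    d = mean.size
    root = np.linalg.cholesky((d + kappa) * cov)             # root @ root.T = (d + kappa) * cov
    sigma = np.vstack([mean, mean + root.T, mean - root.T])  # (2d + 1, d) sigma points
    weights = np.full(2 * d + 1, 1.0 / (2.0 * (d + kappa)))
    weights[0] = kappa / (d + kappa)
    outputs = np.array([np.atleast_1d(f(s)) for s in sigma])
    out_mean = weights @ outputs
    centered = outputs - out_mean
    out_cov = (weights[:, None] * centered).T @ centered
    return out_mean, out_cov

# Sanity check with a linear map, for which the transformation is exact.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
m = np.array([1.0, 2.0])
P = np.array([[0.5, 0.1], [0.1, 0.3]])
lin_mean, lin_cov = unscented_transform(m, P, lambda s: A @ s)

# Hypothetical nonlinear example: range from an anchor at the origin.
range_mean, range_cov = unscented_transform(
    np.array([3.0, 4.0]), 0.1 * np.eye(2), lambda s: np.linalg.norm(s))
```

In a UKF, this transformation would be applied once to the mobility model in the prediction step and once to the measurement function in the update step.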
An important alternative to the given methods is PF [47]. This method allows for tackling general nonlinear and non-Gaussian systems and, hence, is well suited for tracking problems. The main idea underlying the method is to approximate all the probability density functions of interest by probability mass functions with $M$ samples (see Fig. 2). To that end, at every time step $n$, one uses samples (particles) $x_{n,m}$ of the unknown state and assigns to them weights $w_{n,m}$, $m = 1, 2, \ldots, M$, where all the weights sum up to one. With these samples and weights, we approximate, for instance, the filtering density by

$$p(x_n | y_{1:n}) \approx \sum_{m=1}^{M} w_{n,m}\, \delta(x_n - x_{n,m}) \tag{16}$$

where $\delta(\cdot)$ is the Dirac delta pseudo-function. This way, integral operations simplify into sums.

The PF method is a sequential method, and there exist various versions of it. Here, we explain its basic version known as the bootstrap filter [48]. At every time instant, the bootstrap filter performs three operations: 1) propagation of particles; 2) computation of particle weights; and 3) resampling. If the current time instant is $n$, the approximation of the posterior is given by the particles and weights in (16). With particle propagation, we generate the particles $x_{n+1,m}$ of $x_{n+1}$. To that end, we draw them from the prior $p(x_{n+1} | x_{n,m})$. In the next step, we find the weights of the particles by simply computing $\tilde{w}_{n+1,m} = p(y_{n+1} | x_{n+1,m})$ and normalizing the $\tilde{w}_{n+1,m}$'s so that they sum up to one. With the obtained particles and weights, we can find a point estimate of the unknown state. For example, if we want the MMSE estimate, we use

$$\hat{x}_{n+1}^{\mathrm{MMSE}} = \sum_{m=1}^{M} w_{n+1,m}\, x_{n+1,m}$$

In the third step, i.e., resampling, we draw particles from the existing set of particles randomly and based on their weights. With this step, we move the region of exploration of the sample space from parts that do not contain large probability masses to parts that are more relevant. After resampling, all the particles have equal weights.
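The three operations of the bootstrap filter can be sketched as follows. This Python example assumes a 1-D Gaussian random-walk mobility model and a Gaussian perception model that observes the position directly; the models, the function name `bootstrap_step`, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def bootstrap_step(particles, y, process_std=0.3, meas_std=0.5):
    """One bootstrap-filter recursion: propagate, weight, estimate, resample."""
    num = particles.size
    # 1) Propagation: draw each new particle from the assumed prior p(x_{n+1} | x_{n,m}).
    particles = particles + rng.normal(0.0, process_std, size=num)
    # 2) Weights: evaluate the likelihood p(y_{n+1} | x_{n+1,m}) and normalize.
    weights = np.exp(-0.5 * ((y - particles) / meas_std) ** 2)
    weights /= weights.sum()
    # MMSE estimate: weighted average of the particles (computed before resampling).
    estimate = float(np.sum(weights * particles))
    # 3) Resampling: draw particles with probabilities given by their weights.
    indices = rng.choice(num, size=num, p=weights)
    return particles[indices], estimate

particles = rng.uniform(0.0, 10.0, size=2000)   # samples from a flat prior p(x_0)
for y in [4.1, 4.4, 4.9, 5.2]:                  # hypothetical position measurements
    particles, pf_estimate = bootstrap_step(particles, y)
```

This sketch resamples at every step with plain multinomial resampling, which is the simplest option; if all likelihood values underflow to zero, the normalization fails, so practical implementations typically work with log-weights.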
As a method for indoor tracking, PF has been adopted in various settings including Wi-Fi RSS readings [49], ubiquitous computing (that exploits a commercial infrared badge system, an ultrasound TOF badge system, and a Wi-Fi device positioning system) [50], RFID [51], and UWB TOA readings aided with INS measurements [16], [52].

B. Methods for Distributed and Cooperative Tracking
In indoor environments, the presence of NLOS channel conditions, and, hence, the difficulty for a node to directly communicate with a sufficient number of anchor nodes, makes the achievement of sufficient coverage and high positioning accuracy particularly challenging. This issue can be partially overcome with distributed cooperative positioning and tracking approaches. With these approaches, the nodes cooperate to improve the knowledge about their own positions, and they conduct the tracking themselves. That is, the task of tracking is distributed to the nodes rather than being delegated to a common central unit. Consequently, distributed algorithms are, in general, scalable, hence attractive for large networks, and intrinsically reliable. Another advantage of distributed cooperative tracking is that there is no need for all the mobile nodes to be within the communication range of multiple anchors [53].
One natural way to build decentralized schemes is to extend the classical localization/tracking schemes to the cooperative scenario, such as cooperative LS [38]. On the other hand, powerful methods based on belief propagation have received particular attention in recent years for distributed cooperative tracking. Belief propagation is an iterative approach, where at each iteration, the nodes obtain an approximated posterior distribution (belief) about their own positions by means of message passing. We briefly explain it by using Fig. 3. In particular, let $p(x^{(k)}_{n-1} | y_{1:n-1})$ be the belief of a generic node $k$ about its state at time $n - 1$ [see Fig. 3(a)]. During the subsequent iteration, its neighbors use their own beliefs and measurements to compute their beliefs about $x^{(k)}_n$, which are sent to node $k$ through message exchange [see Fig. 3(b)]. Then, node $k$ combines all the messages to update its own belief $p(x^{(k)}_n | y_{1:n})$ at time $n$ [see Fig. 3(c)]. This node also computes the messages for its neighbors. The operations are iteratively repeated until convergence.
Factor graphs are efficient tools for representing the conditional dependence among the random variables of a model, as happens in cooperative tracking settings [54]-[56]. Factor graphs exploit the factorization of the joint distribution function of the observed measurements and the unknown locations. They are bipartite graphs composed of variable nodes (measurements and unknown locations) and factor nodes (which describe likelihoods). With these graphs, one is able to estimate the posterior distribution of the unknown locations by passing messages between the factor and variable nodes of the graph. The messages that are passed from the factor nodes to the variable nodes are likelihoods based on the measurements connected to the respective factor nodes, whereas the messages from the variable nodes to the factor nodes are marginal posteriors of these variables obtained from the information of all the factors except the one to which the message is passed. For a recent use of factor graphs for node localization, see [57]; for tracking, see [53].
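The belief update of node $k$ described above can be sketched for a single message-passing iteration. The Python example below is a strongly simplified illustration under assumptions that are not taken from [53]-[57]: positions are one-dimensional and discretized on a grid, each neighbor holds a Gaussian-shaped belief about its own position, and inter-node range measurements have Gaussian errors. All names and values are hypothetical.

```python
import numpy as np

grid = np.linspace(0.0, 10.0, 101)  # candidate 1-D positions (m)

def normalize(p):
    return p / p.sum()

def message(neighbor_belief, measured_range, std=0.3):
    """Message to node k: sum over x_j of p(range | x_k, x_j) * belief_j(x_j)."""
    dist = np.abs(grid[:, None] - grid[None, :])    # |x_k - x_j| for all grid pairs
    likelihood = np.exp(-0.5 * ((measured_range - dist) / std) ** 2)
    return normalize(likelihood @ neighbor_belief)

def update_belief(predicted, messages):
    """Combine the predicted belief of node k with all incoming messages."""
    belief = predicted.copy()
    for msg in messages:
        belief = belief * msg
    return normalize(belief)

# Hypothetical scenario: node k is at 4 m, and its neighbors are near 1 m and 9 m.
belief_a = normalize(np.exp(-0.5 * ((grid - 1.0) / 0.2) ** 2))
belief_b = normalize(np.exp(-0.5 * ((grid - 9.0) / 0.2) ** 2))
predicted_k = np.full(grid.size, 1.0 / grid.size)   # uninformative prediction
belief_k = update_belief(predicted_k,
                         [message(belief_a, 3.0), message(belief_b, 5.0)])
position_k = float(grid[np.argmax(belief_k)])
```

In a full cooperative scheme, node $k$ would in turn send messages computed from `belief_k` to its neighbors, and the exchange would be repeated until the beliefs stop changing.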

C. Fingerprinting Methods
A completely different approach to tracking from those based on geometric-related measurements is fingerprinting (also referred to as mapping or scene analysis). As shown in Fig. 4, the basic idea of fingerprinting is to build a database with features (fingerprints) of the scenario at reference locations and then apply regression techniques to match the measurement and infer the current position. Specifically, the fingerprinting techniques typically operate in two stages, as follows [4].
1) Offline Stage: The scenario is surveyed at known locations, and the features of the environment at each location are then recorded into a database. These features are referred to as fingerprints and could be RSS, magnetometer measurements, or any other type of position-dependent data. For instance, when RSS is considered for fingerprinting, the database is composed of the coordinates of the training location and the RSS of the nearby BSs measured at that location, as shown in Fig. 4.
2) Online Stage: This stage refers to the process where the mobile node navigates, while sensing the same type of fingerprints that were recorded in the database. These measurements are then used to perform matching with the content of the database and provide a positioning solution for the mobile.
Fingerprinting techniques are, in general, conceived for localization purposes, but they could also be adapted for tracking. For instance, in [58], a Viterbi-like algorithm was proposed to perform continuous tracking of the user's position from fingerprinting solutions. Many of the proposed methods that use position measurements come from the theory of pattern recognition [59], [60]. The most popular algorithms are probabilistic methods [61], where the position is estimated as that which maximizes the likelihood of the target being in a certain position, given a set of possible discrete locations. This approach can be enhanced by using kernel methods and even extended to nondiscrete locations. The k-nearest neighbor algorithm [62] provides a position estimate based on the average position of the k closest training points in the database. The closeness to these k points is defined according to some adopted distance metric, e.g., the Euclidean distance. The estimate is obtained by averaging, possibly using weights, depending on the type of measurements. Furthermore, in the literature, other methods have been combined with fingerprinting. They include support vector machines [63] and neural networks [64].
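The online stage with the k-nearest neighbor rule can be sketched as follows. In this Python example, the fingerprint database, the two-BS RSS vectors, and the inverse-distance weighting are hypothetical choices made for illustration; they are not taken from [62].

```python
import numpy as np

def knn_position(db_rss, db_xy, observed_rss, k=3, weighted=True):
    """Average the positions of the k fingerprints closest to the observation."""
    dists = np.linalg.norm(db_rss - observed_rss, axis=1)  # Euclidean distance in RSS space
    nearest = np.argsort(dists)[:k]
    if weighted:
        weights = 1.0 / (dists[nearest] + 1e-9)            # closer fingerprints count more
    else:
        weights = np.ones(nearest.size)
    return (weights[:, None] * db_xy[nearest]).sum(axis=0) / weights.sum()

# Hypothetical offline database: positions (m) and RSS (dBm) from two BSs.
db_xy = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 5.0]])
db_rss = np.array([[-40.0, -70.0], [-55.0, -60.0], [-60.0, -75.0], [-70.0, -50.0]])

observed = np.array([-54.0, -61.0])                        # online measurement
nearest_only = knn_position(db_rss, db_xy, observed, k=1)
two_average = knn_position(db_rss, db_xy, observed, k=2, weighted=False)
```

With `k=1`, the estimate snaps to the closest training location, whereas larger values of `k` interpolate between training locations at the cost of some smoothing.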
A clear drawback of tracking based on fingerprinting is the reliance on training, which can be costly and time consuming. Moreover, the database should be regularly updated to account for changes in the scenario. However, the mapping techniques do not require a measurement model, and thus, they are popular due to their simplicity.

D. Methods for SLAM
SLAM is the process by which a mobile node navigates through an environment, senses it, and performs estimation of both the map of the environment and its own location within it. The birth and bloom of SLAM was in the field of robotics. This technology makes a robot autonomous and, in some sense, "self-conscious." Probably, the most popular surveys on the topic are [5] and [65], where Durrant-Whyte and Bailey reviewed the foundations, approaches, and open problems at the time of their writing. A more recent review can be found in [66].
In SLAM, the problem is to track a mobile node based on real-time range-related measurements with $K_L$ landmarks, which are basically anchor nodes deployed at unknown fixed positions $p^{(l)}_n = p^{(l)}\ \forall n$ for the $l$th landmark. Here, the state of the system, i.e., $x_n$, is composed of the mobile's state $x^M_n$ and $x^L = p^L$, which includes all landmarks' states $\{p^{(1)}, p^{(2)}, \ldots, p^{(K_L)}\}$. Typically, in a SLAM problem, one mobile node does the surveying, although more sophisticated setups might be possible, including a larger number of cooperative mobile nodes. In a general setting, the state of the mobile node is intentionally affected by the node itself according to some navigation policy, for instance, following a predefined path. To account for this, the mobility model has to be modified by including a control signal $u_n$ used to drive the node (e.g., a vehicle) from state $x_{n-1}$ to the desired state $x_n$, that is, $p(x_n | x_{n-1}, u_n)$. This control vector could be a function $u_n = u_n(\hat{x}_{0:n-1})$ of past estimated states $\hat{x}_{0:n-1}$ implementing the navigation policy. For instance, the mobility model in (2) could be rewritten in the SLAM context with the control signal $u_n$ as an additional argument of $g(\cdot)$. The purpose of SLAM is to find $p(x_n | y_{1:n}, u_n) = p(x^M_n, x^L | y_{1:n}, u_n)$ and infer both $x^M_n$ and $x^L$. Notice that the location of the landmarks is unknown, in contrast to the classical tracking problem. However, the probabilistic representation allows one to treat SLAM using the general framework of sequential filtering. As a result, the mathematical tools used to solve the SLAM problem are tightly linked to those for the tracking problem discussed in Section IV-A.

The main idea behind SLAM is shown in Fig. 5. There, we see a mobile node and its trajectory passing among three landmarks (with fixed locations $p^{(1)}$, $p^{(2)}$, and $p^{(3)}$). At time instant $n - 1$, the mobile node constructs $p(x^M_{n-1}, x^L | y_{1:n-1}, u_{n-1})$, where $x^L$ is the set $\{p^{(1)}, p^{(2)}, p^{(3)}\}$, and to that end, it uses all the measurements $y_{1:n-1}$ and its control signal $u_{n-1}$. At time instant $n$, the node receives new measurements $y_n$, and it updates the posterior to $p(x^M_n, x^L | y_{1:n}, u_n)$.

There is a vast literature on the so-called EKF-SLAM approach with different implementations and information sources [67]. There is also considerable literature on PF approaches where challenging nonlinearities and non-Gaussianities are treated. A prominent PF solution is the celebrated FastSLAM [68]. FastSLAM takes into account the conditional linear structures in the model to reduce the dimensionality of the problem [47]. Common challenges associated with SLAM include wrong data association among observations and landmarks [69], [70] and the close-the-loop problem, which occurs when a landmark is reobserved after a long period. When the latter happens, in general, it is hard to make an association to the same landmark, particularly when using visual aids to detect features. Other challenges are related to the applied methods. For example, the solutions based on the EKF typically suffer from linearization errors [71], [72], whereas the solutions based on the PF suffer from particle depletion [73]. Furthermore, some works point out that the static nature of the locations of the landmarks might cause the PF to diverge [74]. Finally, the computational demands of SLAM techniques are generally high [75].
Visual aids are typically used as the primary source of information in robotics, and thus, one resorts to image processing techniques to determine the relative location of the mobile node to landmarks. See [76] for a thorough introduction to the Visual SLAM topic. In recent years, due to the widespread deployment of Wi-Fi access points, the SLAM research community has worked on exploiting the Wi-Fi infrastructure. The underlying idea is to achieve SLAM by range measurements computed using RF signals received by the mobile node. The most popular approach is most likely Wi-Fi-SLAM [77]. This approach builds on Gaussian process latent variable models to reduce the dimensionality and determine the latent-space locations of unlabeled signal strength data. An extension of this work considers the use of GraphSLAM for computational reduction and for removing the signature uniqueness assumption [78]. FootSLAM [79] proposes to mount an inertial unit on pedestrians to perform SLAM based on its measurements and a Bayesian filter. FootSLAM was extended to incorporate sporadic observations from other sensors (e.g., RFID or camera) and to combine data from multiple pedestrians collaboratively. These approaches are termed PlaceSLAM and FeetSLAM, respectively. The use of Wi-Fi signals in FootSLAM was treated in the so-called WiSLAM algorithm [80]. It integrates RSS measurements of the communication network and the foot-mounted inertial sensors.

E. Fusion Methods
In many settings, there are measurements acquired by different types of sensors. With fusion techniques, the information from these measurements is combined to improve on the tracking obtained when only a single type of measurement is used. In general, when one employs probabilistic models and Bayes' theory, with the assumption of conditionally independent measurements, optimal fusing is known as an independent likelihood pool. Suppose the $m$th node has different types of sensors with measurements about node $k$. Let the measurements of these sensors be denoted by $y^{(k,m)}_{j,n}$, $j = 1, 2, \ldots, J_m$. Then, if the measurements are conditionally independent, the perception model in (12) can be written as

$$p(y^{(k,m)}_n | x_n) = \prod_{j=1}^{J_m} p(y^{(k,m)}_{j,n} | x_n)$$

If the conditions of independence are not met, there are other ways of combining information from different sensors. For example, one of them is based on attaching a weight to the information provided by each sensor and linearly combining the likelihoods, i.e.,

$$p(y^{(k,m)}_n | x_n) = \sum_{j=1}^{J_m} \alpha^{(k,m)}_j\, p(y^{(k,m)}_{j,n} | x_n)$$

where $\alpha^{(k,m)}_j$ are the weights assigned to the sensors, and $\sum_{j=1}^{J_m} \alpha^{(k,m)}_j = 1$.

As an example of fusion, we explain how inertial measurements, discussed in Section III, can be integrated with measurements of other types. There are several options for integrating them, and they mainly differ in the level of integration.
1) Loose Integration: The simplest integration consists in fusing the position estimates of the strapdown INS with the position estimates of another geometric-based system to form a final integrated position solution. The resulting estimate is generally better than the estimates obtained from the individual systems. In the literature, this is referred to as loose integration.
2) Tight Integration: Loose integration could be further improved if the information of the INS is taken into account when computing the geometric-based position solution. This is referred to as tight integration in the literature, as there are no independent solutions for both systems but a single blended navigation solution. This approach is known to be more robust and reliable than loose integration. For instance, a tight integration of INS data and a technology supporting one of the aforementioned distance-based measurements could consist in plugging the acceleration measurements in the state equation, while introducing the sensor biases in the state vector. Data fusion is then readily performed by the filtering methods discussed earlier in this section.
As another example of fusion, suppose that a system needs to track several targets with two types of sensors, i.e., one that provides information about the ID of the targets, referred to as ID sensors, and another that has accurate ranging (e.g., laser range finders). When the targets are far away from the ID sensors, the system tracks the targets but does not know their IDs. When these targets get close to the ID sensors, they get identified. Thus, the system needs information from both sensors to perform as desired. When the targets leave the area covered by the ID sensors, the system may lose track of their identities. This is known in the literature as the data-association problem.
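As a rough illustration of loose integration, the Python sketch below fuses two position estimates, each with an error covariance, by information (inverse-covariance) weighting. This particular fusion rule, the assumption of independent Gaussian errors, and all numerical values are assumptions made for this example; the text above does not prescribe a specific rule.

```python
import numpy as np

def fuse_positions(pos_a, cov_a, pos_b, cov_b):
    """Fuse two position estimates, assuming independent Gaussian errors."""
    info_a = np.linalg.inv(cov_a)
    info_b = np.linalg.inv(cov_b)
    cov_fused = np.linalg.inv(info_a + info_b)
    pos_fused = cov_fused @ (info_a @ pos_a + info_b @ pos_b)
    return pos_fused, cov_fused

# Hypothetical estimates (m) from a strapdown INS and from a radio-based system.
ins_pos, ins_cov = np.array([2.0, 3.0]), np.diag([1.0, 1.0])
radio_pos, radio_cov = np.array([2.6, 2.8]), np.diag([0.25, 0.25])
fused_pos, fused_cov = fuse_positions(ins_pos, ins_cov, radio_pos, radio_cov)
```

Under the stated independence assumption, the fused covariance is smaller than either input covariance. In tight integration, by contrast, the raw inertial measurements would enter the state equation of a single filter instead of being fused at the position level.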
There are many recent papers on fusion methods for indoor tracking.They address different applications (e.g., tracking of moving human targets [81] or navigation in emergency scenarios [82]) and present fusion of different types of measurements (e.g., radio-based ranging and speed-based sensing measurements [83], TOF wireless ranging and microelectromechanical (MEMS) inertial sensor measurements [84], inertial sensor and Wi-Fi measurements [85], or various signals of opportunity [86]).
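The two ways of combining sensor likelihoods introduced at the beginning of this subsection, i.e., the product rule for conditionally independent measurements and the weighted linear combination, can be compared on a grid of candidate positions. In the Python sketch below, the Gaussian-shaped likelihoods of the two sensors and the equal weights are hypothetical.

```python
import numpy as np

def independent_pool(likelihoods):
    """Product of the per-sensor likelihoods (conditionally independent case)."""
    return np.prod(likelihoods, axis=0)

def linear_pool(likelihoods, alphas):
    """Weighted sum of the per-sensor likelihoods; the weights must sum to one."""
    alphas = np.asarray(alphas, dtype=float)
    if not np.isclose(alphas.sum(), 1.0):
        raise ValueError("weights must sum to one")
    return np.tensordot(alphas, likelihoods, axes=1)

grid = np.linspace(0.0, 10.0, 201)                        # candidate 1-D positions (m)
sensor_1 = np.exp(-0.5 * ((grid - 4.0) / 1.0) ** 2)       # coarse sensor, peak at 4 m
sensor_2 = np.exp(-0.5 * ((grid - 5.0) / 0.5) ** 2)       # finer sensor, peak at 5 m
likelihoods = np.stack([sensor_1, sensor_2])

product_peak = float(grid[np.argmax(independent_pool(likelihoods))])
linear_peak = float(grid[np.argmax(linear_pool(likelihoods, [0.5, 0.5]))])
```

The product rule yields a single sharpened peak between the two sensors, closer to the more precise one, whereas the linear combination preserves the modes of both sensors and is therefore more conservative.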

F. Other Methods
Here, we briefly describe some other methods to solve the tracking problem. Details can be found in the referenced papers.
While multipath usually represents a detrimental effect on the accuracy of positioning systems, it can be usefully exploited to reduce the number of anchors, as done in [87]. Specifically, through the use of floor plan information, signal reflections are interpreted as originating from the so-called virtual anchors, which can be used to resolve the position ambiguities. The effectiveness of this approach has been demonstrated in [87] in a cooperative tracking setting by formulating the problem with factor graphs and using belief propagation.
As stated in Section III-A6, position information can also be inferred from simple connectivity information. One method is to count the number of hops necessary to reach a specific node starting from another node in the network and to derive a rough estimate of the distance by evaluating the average hop length, as proposed in the DV-hop algorithm [88]. A different approach is represented by range-free localization algorithms. Here, the problem is to find a node's position such that all, or most, proximity constraints are satisfied. As the number of constraints increases, the feasible region of solutions for the node's position, given by the intersection of individual constraints, becomes smaller [89], [90].
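The hop-counting idea can be illustrated with a short Python sketch. An anchor estimates the average hop length from its known distances and hop counts to the other anchors, and a node multiplies its hop count to that anchor by this length. The anchor coordinates and hop counts below are hypothetical, and the sketch covers only the distance-estimation step of a DV-hop-style scheme.

```python
import math

def average_hop_length(anchors, hops, i):
    """Average hop length seen by anchor i: total distance / total hop count."""
    others = [j for j in range(len(anchors)) if j != i]
    total_dist = sum(math.dist(anchors[i], anchors[j]) for j in others)
    total_hops = sum(hops[i][j] for j in others)
    return total_dist / total_hops

def hop_based_distance(hops_to_anchor, hop_length):
    """Rough distance estimate from a hop count and an average hop length."""
    return hops_to_anchor * hop_length

# Hypothetical network: three anchors (m) and the hop counts between them.
anchors = [(0.0, 0.0), (30.0, 0.0), (0.0, 40.0)]
hops = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]

hop_len = average_hop_length(anchors, hops, 0)   # (30 + 40) / (3 + 4)
node_dist = hop_based_distance(2, hop_len)       # node is two hops from anchor 0
```

The resulting distances are coarse, since they assume that hops are of similar length and roughly aligned, but they can be fed to a multilateration step such as an LS estimator.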
Another methodology that exploits RSS is known as RF tomography. It is based on RSS measurements of RF transmissions between multiple sensor nodes. If there are targets near the LOS path between two nodes, they cause changes in the RF signal. One exploits the changes or the absence of changes of the RSS to estimate the location of a target and/or to perform tracking. It has been shown that RF tomography is promising in that one may accurately track several targets simultaneously [91].
Swarm optimization is a bioinspired technique for multidimensional optimization. It has also been tailored to address problems in WSNs, including optimal deployment, node localization, and tracking. Swarm optimization is iterative in nature and is based on trying a set of candidate solutions to the problem and modifying them over the iterations according to predefined rules. For a recent survey of its use in WSNs, see [92].

V. TECHNOLOGIES FOR INDOOR TRACKING
The number of technologies for indoor tracking is large; they include lasers, computer vision, sonar, and infrared. Here, due to lack of space, we specifically focus on those based on radio signal exchange.
Most wireless standards have been designed and optimized with data and voice communication services in mind but not positioning and tracking. Only recently have wireless communication standards started to take into consideration the support of dedicated positioning and tracking capabilities. The interest of the mobile industry in accelerating the adoption of indoor positioning solutions led to the foundation of the InLocation Alliance (ILA, inlocationalliance.org). The goal of the alliance is to facilitate a rapid market adoption of RTLS so that new business streams are opened up with context-aware applications in indoor environments. The ILA chose Wi-Fi and Bluetooth as the preferred technologies.
Table I presents a qualitative comparison of the available technologies with a brief description of their main characteristics. The numbers are only indicative, as they strongly depend on the environment and its variability. It can be noted that there is no technology that is able to provide satisfactory performance in terms of positioning accuracy, coverage, and infrastructure cost in all the environments. As a consequence, at least in the short term, we will most likely see the adoption of single technologies in niche applications and a mix of technologies via fusion methods in consumer applications.

A. Short-Range Wireless Technologies
Over the past 20 years, we have witnessed the proliferation of many standards that offer short-range wireless connectivity, each of them targeting a different application segment. The most widely used are Wi-Fi for wireless local area network applications, RFID and NFC for item/object identification, Bluetooth for wireless personal area networks, and IEEE 802.15.4/ZigBee enabling WSNs [93]. These standards were not conceived for localization and tracking purposes, and therefore, localization/tracking services can only exploit connectivity information or RSS and phase measurements, as illustrated in Section III. Consequently, the corresponding positioning performance could be rather poor. Moreover, nodes are often deployed randomly or with the objective of coverage of communication services. This generates the necessity to support the network with suitable tracking algorithms, for example, exploiting cooperation between nodes or signals of opportunity as described in Section IV. In large networks (e.g., WSNs), it is of paramount importance that the complexity of the positioning algorithm is scalable with the number of nodes and/or the connectivity level of the network [94].

B. Cellular Networks
Cellular networks rely on a set of BSs, with a coverage radius ranging from a few meters to tens of kilometers. Historically, the first example of a location service offered by cellular systems is E-911, introduced for emergency calls in the United States [95]. The simplest but very inaccurate way to get location information is through the ID of the cell that serves the user equipment (i.e., proximity). As a consequence, the localization accuracy is of the order of the cell size. Potentially, the 2G/3G cellular physical (PHY) layer can provide ranging information through signal TOA estimation [more precisely, observed TDOA (OTDOA)], although the relatively small bandwidth and the signal structure limit the achievable time resolution (1 μs for Global System for Mobile Communication (GSM), about 200 ns for 3G systems). Current location estimation algorithms try to exploit any available information about the environment (e.g., fading conditions, Doppler frequency, and network topology) to attain higher accuracy through data fusion methods.
In the last decade, cellular network standard protocols have allocated resources to carry GNSS assistance data to GNSS-enabled mobile devices, to implement assisted-GPS/assisted-GNSS services in both GSM and Universal Mobile Telecommunications System networks [96]. The 3GPP introduced location services in LTE Release 9, published in December 2009 [97]. LTE technology offers a tight synchronization between BSs and the possibility to use wideband signals with low interference. This standard specifies a dedicated support for positioning. With it, a significant performance improvement with respect to previous cellular network generations is expected, with positioning accuracy better than 20 m for 50% of the cases and 63 m for 95% of the cases using OTDOA [98]. The forthcoming 5G mobile communication standard is expected to embed high-accuracy positioning capabilities due to the adoption of small cells and massive antenna arrays at millimeter waves [99].
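To relate the quoted time resolutions to distances, a time uncertainty can be multiplied by the speed of light. The short Python sketch below performs this conversion for the two values mentioned above; it is a back-of-the-envelope illustration and not a model of actual cellular positioning accuracy.

```python
C = 299_792_458.0  # speed of light (m/s)

def range_resolution_m(time_resolution_s: float) -> float:
    """Distance traveled by a radio wave during the given time interval."""
    return C * time_resolution_s

gsm_resolution = range_resolution_m(1e-6)      # 1 us   -> roughly 300 m
umts_resolution = range_resolution_m(200e-9)   # 200 ns -> roughly 60 m
print(f"GSM: {gsm_resolution:.0f} m, 3G: {umts_resolution:.0f} m")
```

These figures indicate why time-based ranging with 2G/3G signals is too coarse for room-level indoor tracking without additional information.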

C. UWB Technology
This technology has generated considerable and growing interest since February 2002, when the Federal Communications Commission opened up 7.5 GHz of spectrum (from 3.1 to 10.6 GHz) for use by UWB devices [100]. The traditional design approach for a UWB communication system uses time-domain pulses of very short duration, typically on the order of a nanosecond, thereby spreading the spectrum of the transmitted signal over a frequency band wider than 500 MHz. This method is usually called impulse radio UWB. A great advantage of the short pulse modulation is the possibility to estimate the TOA with a fine resolution, which translates into ranging with an accuracy of a few centimeters [101]. Therefore, UWB is promising for high-definition indoor tracking [35], particularly after the publication of the first UWB-based standard, IEEE 802.15.4a, specifically addressed to WSN applications as a low-cost and low-power PHY-layer substitute of the IEEE 802.15.4 PHY [3]. The first commercial IEEE 802.15.4a-compliant chipset was delivered in 2014. In parallel, several companies proposing proprietary UWB products for RTLS have been deeply involved in the development of the new IEEE 802.15.4f standard, which is devoted to specifying a solution for precise indoor positioning and tracking using low-cost and low-consumption tags [102].

D. Near-Field Technology
Here, we briefly mention the near-field electromagnetic ranging (NFER)-based technology, which adopts low frequencies (typically around 1 MHz) and, consequently, long wavelengths (about 300 m) [103]. The key idea of this solution is to exploit the deterministic relationship that exists (in free space) between the angle formed by the electric and magnetic fields of the received signal and the distance between the transmitter and the receiver when operating in near-field propagation conditions. This low-frequency approach to ranging provides good obstacle penetration and multipath resistance. The main drawbacks of NFER are the large antennas required and the poor energy efficiency of the corresponding wireless devices.
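The numbers above, and the kind of deterministic relationship that NFER exploits, can be checked with a minimal sketch. It assumes an ideal, electrically small electric dipole in free space, for which the phase difference between the electric and magnetic fields is arccot((kr)^3) with k = 2π/λ; this textbook model and the function names are assumptions made for illustration and are not taken from [103].

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength(freq_hz: float) -> float:
    """Free-space wavelength (m) at the given frequency (Hz)."""
    return C / freq_hz


def phase_delta(r_m: float, freq_hz: float) -> float:
    """E/H phase difference (rad) at distance r_m for an ideal small dipole."""
    kr = 2.0 * math.pi * r_m / wavelength(freq_hz)
    return math.atan2(1.0, kr ** 3)  # arccot((kr)^3)


def range_from_phase_delta(delta_rad: float, freq_hz: float) -> float:
    """Invert phase_delta: distance (m) from a measured phase difference."""
    if not 0.0 < delta_rad <= math.pi / 2.0:
        raise ValueError("phase difference must lie in (0, pi/2]")
    kr = (1.0 / math.tan(delta_rad)) ** (1.0 / 3.0)
    return kr * wavelength(freq_hz) / (2.0 * math.pi)


f = 1e6  # 1 MHz, as in the text
print(f"wavelength: {wavelength(f):.1f} m")
print(f"lambda / (2 pi): {wavelength(f) / (2.0 * math.pi):.1f} m")
for r in (5.0, 20.0, 40.0):
    delta = phase_delta(r, f)
    print(f"r = {r:4.1f} m -> {math.degrees(delta):5.1f} deg "
          f"-> recovered r = {range_from_phase_delta(delta, f):.1f} m")
```

Under this model, the phase difference decreases from 90° close to the transmitter toward 0° in the far field and equals 45° at r = λ/(2π), about 48 m at 1 MHz. This is why a long wavelength is needed to cover typical indoor distances.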

VI. CONCLUSIONS AND OUTLOOK
In this paper, we have described the problems of indoor wireless tracking and mapping. We provided a mathematical formulation of the problems and showed their solutions from a signal processing perspective. We reviewed the performance metrics and elaborated on the types of measurements that are used to reach the stated objectives. The main methods for solving the postulated problems, including those of fusion, were illustrated. The emphasis in the survey was on Bayesian approaches. Finally, we also provided a brief review of available technologies.
In the years to come, we will see a further increase in research on indoor tracking. This research will be motivated by the development of new technologies and the introduction of new applications. It will not be surprising if we witness a widespread use of indoor tracking technologies to complement and empower pedestrian and vehicular systems in the fields of intelligent transportation systems, automated vehicles, robotics, and location-based services. An important line of work will remain the fusion of information that comes from existing infrastructure, such as signals of opportunity, with information that will be provided by newly deployed systems or collected through the growing pervasive presence of smartphones (crowd sensing) [104].
Some of the methods described in this paper are computationally intensive. The reason is that they are, by design, ambitious in their capacity to extract a large amount of information from the available data. An important challenge is then to bring these methods to a reduced form that would allow for their practical use in a wider range of real-world applications.
There is no doubt that the appearance of novel technologies will continue to drive the research in localization and tracking. One of them is the upcoming Internet of Things (IoT). The IoT will become a very large network of devices, sensors, and objects connected through communications so that novel value-added services can be provided. Furthermore, the network will be dynamic and will find many applications in various indoor settings, including smart homes and smart buildings. It goes without saying that an important piece of information in many of these applications will be the tracking of the "things" in the network. In the IoT, most of these operations will have to be implemented in a distributed way, that is, without a central unit in place. Within this scenario, another important challenge is the development of networks able to identify and track low-cost and energy-autonomous devices (tags) attached to objects or persons. This will require the design of energy-efficient or zero-power solutions (e.g., using passive tags) toward the full integration of RFID and RTLS technologies [8], [105]. This, in turn, will accelerate the development of localization and tracking by signal processing methods over networks.

Fig. 1. System with anchor and mobile nodes. Time indexes are not shown for notational simplicity.
example, in one setting, $x_n^{(k)}$ corresponds to the node's position and velocity in 2-D or 3-D coordinates, i.e., $x_n^{(k)}$ includes the node's position $p_n^{(k)}$ and the node's velocity $\dot{p}_n^{(k)}$. In another setting, apart from the position and velocity, $x_n^{(k)}$ may include the orientation of the mobile node $\theta_n^{(k)}$. Denote with $x_n$ the global state of the system at time step $n$ composed of the states of all the nodes $x_n^{(1)}$, x

Fig. 4. Example of fingerprinting using RSS measurements from Wi-Fi access points.

Fig. 5. Illustrative example of SLAM where a robot is moving while measuring RF signals transmitted from three landmarks. At each time step, the robot estimates its own state and the landmarks' locations from the filtering distribution $p(x_n^{M}, x^{L} \mid y_{1:n}, u_n)$.

TABLE I. COMPARISON OF EXISTING POSITIONING SYSTEMS