Self-Learning Detection and Mitigation of Non-Line-of-Sight Measurements in Ultra-Wideband Localization

Abstract: Non-line-of-sight (NLOS) propagation is one of the main error sources in indoor localization, so a large body of work has been dedicated to identifying and mitigating NLOS errors. The most accurate NLOS detection methods often rely on large training data sets that are time-consuming to obtain and depend on the environment and hardware. We propose a method for detecting NLOS distance measurements without manually collected training data or knowledge of channel statistics. Instead, the algorithm generates LOS/NLOS labels for sets of distance measurements between fixed sensors and the mobile target based on distance residuals. The residual-based detection has 70-80% accuracy, but it has high complexity and cannot be used with high confidence on all measurements. Therefore, we use the predicted labels and the channel impulse responses of the measurements to train a classifier that achieves over 90% accuracy and can be used on all measurements, with low complexity. After we train the classifier during an initial phase that captures specifics of the devices and of the environment, we can skip the residual-based detection and use only the trained model to classify all measurements. We also propose an NLOS mitigation method that reduces, on average, the mean and standard deviation of the localization error by 2.2 and 5.8 times, respectively.


I. INTRODUCTION
Indoor localization has garnered attention in recent years for its useful applications such as navigating in public spaces, offering customized location-based services and interactions with the environment, or controlling and monitoring industrial robots and indoor drones. One popular localization method uses several sensors (called anchors) with fixed and known positions that communicate with a mobile target (called tag). The method estimates the location of the tag based on time or distance measurements between each anchor and the tag. Non-line-of-sight (NLOS) propagation, in which an object or a person obstructs the direct path between two devices, affects most localization methods. In this case, the observed time of flight (TOF) or distance between two devices is larger than without the obstacle, which causes a localization error.
This work was supported by funding from European Union's Horizon 2020 Research and Innovation programme under the Marie Skłodowska Curie grant agreement No. 813278 (A-WEAR: A network for dynamic wearable applications with privacy constraints, http://www.a-wear.eu/). The work was also partly supported by a grant from the Romanian National Authority for Scientific Research and Innovation, UEFISCDI project PN-III-P3-3.6-H2020-2020-0124.
Detecting and correcting NLOS distance errors has been widely studied in the literature [1], [2], [3], [4], [5], [6]. The methods with the highest detection accuracy rely on the statistics of the channel impulse response (CIR) of the signal [2], [3], [4]. However, they require extensive measurement campaigns to learn the statistics of LOS and NLOS measurements. Such measurements are time-consuming, require some expertise, depend on the environment and on the hardware used, and need to be repeated frequently in order to capture the environment dynamics. Therefore, collecting training data before every deployment and keeping databases up to date are demanding tasks, which are usually infeasible in practice.
In this work, we propose an NLOS detection and mitigation method that requires neither manually acquired training data nor channel statistics. Fig. 1 shows the main steps of our approach. When a tag is first deployed in an area, it starts the initial phase of the algorithm, in which it collects distance measurements and CIRs from all the anchors in the area. In this first step, the measurements are labeled as LOS, NLOS, or ambiguous using residual analysis. If there are more anchors than the minimum necessary for 2D or 3D localization, we can compute locations using each subset of anchors. Since NLOS anchors introduce higher location errors, they also have higher distance residuals, defined as the difference between the measured distance and the Euclidean distance between the anchor and the estimated position of the tag. The method labels anchors as LOS/NLOS if their average residuals can be grouped in two one-dimensional clusters (intervals). The labeling step has an acceptable accuracy (70-80 %), but it cannot be used on all sets of measurements, since for small NLOS errors the separation between LOS and NLOS residuals is not clear. Moreover, the residual-based labeling step requires computing the location using all anchor combinations, which scales with $O(2^N)$ for $N$ anchors.
Therefore, we introduce the second step, in which we train a Random Forest (RF) classifier using the labels predicted in the first step and the CIR features of the measurements. The model can recover the correct class boundary even with noisy labels, reaching a higher classification accuracy (>90 %) than the residual-based labeling. After the model is trained, we can directly classify all distance measurements as LOS/NLOS using the RF and skip the residual-based labeling. Classifying samples using RF has a constant complexity and can be used in all localization instances. We further propose a location-correction method based on identified LOS/NLOS measurements which does not discard NLOS measurements. We evaluate the accuracy and localization error of the proposed method on a database of measurements acquired with UWB devices, which therefore resembles a real localization setup.
[Fig. 1: Step 1: Residual-based Labeling. Step 2: Machine Learning Model Training. Step 3: NLOS Error Mitigation.]

II. RELATED WORK
NLOS identification methods based on channel statistics have been well studied in the literature [2], [3], [4] and they achieve very high classification accuracy (>90 %). However, they need extensive measurement campaigns to collect training data, which are rarely feasible in practice.
NLOS errors can be directly mitigated using semi-definite programming [5], [7] without assuming any measurement statistics. However, these methods are usually more computationally expensive than plain localization algorithms [5]. Other methods that do not require error statistics use additional hardware, such as inertial measurement units [8].
In [9], the authors proposed an NLOS mitigation technique for dense NLOS environments that does not need training but assumes the measurement variance to be known. The error was corrected using two extended Kalman filters (EKFs), applied alternately depending on the LOS/NLOS condition.
In [10], the authors propose an unsupervised NLOS identification method. The biggest difference from our approach is that the method in [10] can classify data only in bulk (so not online) because it needs a collection of data points to obtain the distribution of features with Gaussian mixture models.
In [11] and [12], the authors proposed NLOS identification methods based on pre-trained models. In [11], a convolutional neural network (CNN) is trained in one environment and updated with data from a new environment. The method is validated in two similar office environments, so it is not clear how well the model can be transferred between two very different environments, e.g., a mall and an office. This is also an open research question for our method. However, we do not rely on a pre-trained model but can train it online. In [12], a pre-trained model is improved by retraining using unlabeled samples. This approach can be used to improve the accuracy of our model (discussed in Section VI).
The closest work to ours is [6], where the authors used anchor residuals instead of CIR features to train a classifier. The simulated error for NLOS measurements was sampled from a uniform distribution between 0.75 and 3.5 m, so NLOS measurements were easily distinguishable from LOS ones, which had zero-centered normally distributed errors. However, in our measurement campaigns (described in Section IV), we found that typical NLOS errors with UWB devices are spread over a smaller range of 0.25-0.8 m, so it was harder to accurately identify NLOS measurements using only anchor residuals. Therefore, we propose a classification in two steps: first using anchor residuals, then using CIR features.
Another work that applied residual analysis to identify NLOS errors is [13]. The authors used residual analysis to identify NLOS errors, a voting algorithm to correct these errors, and a fuzzy C-means algorithm to classify NLOS errors into "hard" and "soft" NLOS. A Kalman filter (KF) and an unscented Kalman filter (UKF) filtered the two types of NLOS errors and corrected the location estimates. The authors in [13] focused more on NLOS error mitigation than on NLOS identification and did not report the accuracy of the classification method alone. Compared to [13], we also provide an NLOS detection method, which can be useful in detecting obstacles, creating building maps, or estimating crowd densities. We show that we can also reduce localization errors with our NLOS detection method.

III. NON-LINE-OF-SIGHT DETECTION
In this section, we present the basics of anchor-based localization (Section III-A) and the main steps of the unsupervised NLOS detection method, also highlighted in Fig. 1. First, we use residual analysis to obtain LOS/NLOS labels using only distance measurements (Section III-B). By repeating this step for multiple locations, we create a database of CIR features and their predicted labels. These are given as training data to an RF classifier (Section III-C). Once trained, the classifier can directly classify subsequent measurements.

A. Localization
Range-based localization estimates the coordinates $\hat{x}$ of a target (also called a tag) using the distance measurements between the tag and $N$ anchors with known locations $x_{A_i}$, where $A_i$ is the $i$th anchor, for $i = 1, \ldots, N$. When the direct path between two devices is unobstructed, also known as line-of-sight (LOS) propagation, the distance between two devices can be recovered as $d = c \cdot \text{TOF}$, where $c$ is the speed of light. If an object or person blocks the direct path, the signal usually travels at a lower speed than through the air. This causes a larger TOF than without the obstacle and the estimated distance is larger than in reality. Obstacles can also completely block the direct signal; in such cases, reflections which travel longer paths than the direct signal can arrive at the receiver and also cause time delays. The last two scenarios are known as non-line-of-sight (NLOS) and cause distance and localization errors in UWB localization.
The anchor-tag distances can be written as:
$$d_i = \lVert x - x_{A_i} \rVert + v_i, \quad i = 1, \ldots, N, \tag{1}$$
where $x$ is the true location of the tag, $\lVert \cdot \rVert$ is the Euclidean norm, and $v_i$ is a noise term. In vector form, this becomes:
$$y = h(x) + v, \tag{2}$$
where $y$ is a vector which contains all distance measurements $d_i$, $i = 1, \ldots, N$, $v$ is the error vector, and $h$ is a vector-valued measurement function. The location can be found using the least squares method:
$$\hat{x} = \arg\min_{x} \lVert y - h(x) \rVert^2. \tag{3}$$
We chose the regularized Gauss-Newton multilateration algorithm from [14], because it has low computational complexity and localization errors comparable to closed-form solutions. The algorithm needs an initial location, which can be obtained with a closed-form solution. For each iteration $k$, the algorithm computes the Jacobian matrix:
$$J_k = \left. \frac{\partial h(x)}{\partial x} \right|_{x = x_k}. \tag{4}$$
The solution at each iteration is $x_{k+1} = x_k + \Delta x_k$, where $\Delta x_k$ is the least-squares solution to:
$$\begin{bmatrix} I_N J_k \\ c I_M \end{bmatrix} \Delta x_k = \begin{bmatrix} I_N \left( y - h(x_k) \right) \\ c \left( x_r - x_k \right) \end{bmatrix}, \tag{5}$$
where $I_N$ is the identity matrix of size $N \times N$, $I_M$ is the identity matrix whose size $M$ equals the dimension of $x$, $x_r$ is a regularization location, and $c$ is a regularization coefficient equal to the inverse of the standard deviation of a distribution centered at $x_r$. The algorithm stops when the norm of the incremental location is smaller than the tolerance $\delta$ or when it reaches the maximum number of iterations $k_{\max}$.
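As an illustration of the iteration described above, the following is a minimal 2D sketch of a regularized Gauss-Newton solver, not the implementation from [14]. It assumes unit measurement weights and solves the regularized step through its normal equations; the function name `gauss_newton_2d`, the anchor layout, and the choice of the area center as both initial and regularization location are assumptions made for the example.

```python
import math

def gauss_newton_2d(anchors, dists, x0, x_reg, c=0.1, tol=1e-3, k_max=10):
    """Regularized Gauss-Newton multilateration in 2D (illustrative sketch)."""
    x, y = x0
    for _ in range(k_max):
        # Normal equations of the stacked system [J; c*I] dx = [r; c*(x_reg - x)]:
        # (J^T J + c^2 I) dx = J^T r + c^2 (x_reg - x)
        a11 = a22 = c * c
        a12 = 0.0
        b1 = c * c * (x_reg[0] - x)
        b2 = c * c * (x_reg[1] - y)
        for (ax, ay), d in zip(anchors, dists):
            rng = math.hypot(x - ax, y - ay) or 1e-9  # guard against division by zero
            jx, jy = (x - ax) / rng, (y - ay) / rng   # Jacobian row of h at (x, y)
            res = d - rng                             # measured minus predicted distance
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * res
            b2 += jy * res
        # Solve the symmetric 2x2 system with Cramer's rule.
        det = a11 * a22 - a12 * a12
        dx = (b1 * a22 - a12 * b2) / det
        dy = (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:  # stop when the increment is below the tolerance
            break
    return x, y

# Hypothetical setup: four anchors at the corners of a 9 m x 20 m area.
anchors = [(0.0, 0.0), (9.0, 0.0), (9.0, 20.0), (0.0, 20.0)]
true_pos = (3.0, 7.0)
dists = [math.dist(true_pos, a) for a in anchors]  # noiseless LOS distances
center = (4.5, 10.0)
estimate = gauss_newton_2d(anchors, dists, x0=center, x_reg=center)
error = math.dist(estimate, true_pos)
```

With noiseless distances the estimate lands close to the true position; the small remaining offset comes from the regularization term pulling the solution toward `x_reg`.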

B. Unsupervised Labeling with Anchor Residuals
If the distance measurements $d_i$ are noiseless, i.e., $v_i = 0$ in Eq. (1), then in the 2D case the tag is found at the intersection of the circles centered at the anchors, with radii equal to $d_i$, for $i = 1, \ldots, N$ (see Fig. 2a). If the distance measurements are noisy, the circles no longer intersect in a single point and the tag's location is (ideally) found inside the intersection area of the circles, as shown in Fig. 2b. The residual of anchor $A_i$ is defined as [1]:
$$r_i = d_i - \hat{d}_i = d_i - \lVert \hat{x} - x_{A_i} \rVert, \tag{6}$$
where $d_i$ is the measured distance between anchor $A_i$ and the tag and $\hat{d}_i$ is the distance between the anchor and the estimated location $\hat{x}$ of the tag. Intuitively, an anchor's residual is likely to be higher when the anchor is in NLOS with the tag, since this causes a higher distance measurement error [1]. For $M$-dimensional localization, at least $M + 1$ anchors are needed to solve the system of equations from Eq. (2). When more anchors are available, the location can be estimated from any subset $S$ of at least $M + 1$ anchors, and the residual of the subset is [1]:
$$r_S = \frac{1}{|S|} \sum_{i \in S} r_{i,S}^2, \tag{7}$$
where $|\cdot|$ denotes the cardinality of a set and $r_{i,S}$ is the residual of the $i$th anchor in this subset. In [1], the final location estimate is obtained as the weighted average of the intermediate locations obtained in all subsets, where the weight is the inverse of a subset's residual.
[Fig. 2: In the ideal (2D) case, the tag is found at the intersection point between the circles centered at the anchors' locations with the radius equal to each anchor-tag distance. When the measured distances are noisy, the circles do not intersect in a single point anymore. An anchor's residual $r_i$ is the difference between the measured distance $d_i$ and the distance $\hat{d}_i$ between the estimated location and the anchor's location.]
Starting from the insight that anchor residuals are higher when anchors are in NLOS with the tag, we propose using the average residual of an anchor, computed over all subsets in which it is used, to identify NLOS anchors:
$$\bar{r}_i = \frac{1}{|\mathcal{S}_i|} \sum_{S \in \mathcal{S}_i} r_{i,S}, \tag{8}$$
where $\mathcal{S}_i$ is the set of all anchor subsets that contain $A_i$. Because NLOS anchors have higher distance measurement errors, their average residuals are also usually higher than those of LOS anchors. In many cases, residuals coming from LOS and NLOS anchors form two 1D clusters (or intervals) which can be separated. Fig. 3 shows such an example, for a simulation of eight anchors when the tag and some anchors are separated by a wall. The wall introduces a lognormally distributed error with a median of 24 cm and a standard deviation of 1.8 m (the parameters were obtained from a measurement campaign [15]). In many cases, the average residuals of LOS and NLOS anchors can be clearly delimited.
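The averaging over subsets can be sketched as follows. This is only an illustration under several assumptions: the helper `locate` is a simplified regularized Gauss-Newton solver that starts at, and is regularized toward, a rough initial guess `x0` (for instance a previous location estimate), the residuals are kept signed, and the anchor layout and the 1 m NLOS bias on the first anchor are made up for the example.

```python
import math
from itertools import combinations

def locate(anchors, dists, x0, c=0.1, k_max=10, tol=1e-3):
    """Regularized Gauss-Newton in 2D, regularized toward the initial guess x0."""
    x, y = x0
    for _ in range(k_max):
        a11 = a22 = c * c
        a12 = 0.0
        b1, b2 = c * c * (x0[0] - x), c * c * (x0[1] - y)
        for (ax, ay), d in zip(anchors, dists):
            rng = math.hypot(x - ax, y - ay) or 1e-9
            jx, jy = (x - ax) / rng, (y - ay) / rng
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * (d - rng)
            b2 += jy * (d - rng)
        det = a11 * a22 - a12 * a12
        dx, dy = (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            break
    return x, y

def average_residuals(anchors, dists, x0, dim=2):
    """Average signed residual of each anchor over all subsets that contain it."""
    n = len(anchors)
    sums, counts = [0.0] * n, [0] * n
    for size in range(dim + 1, n + 1):            # subsets with at least dim + 1 anchors
        for subset in combinations(range(n), size):
            est = locate([anchors[i] for i in subset], [dists[i] for i in subset], x0)
            for i in subset:
                sums[i] += dists[i] - math.dist(est, anchors[i])
                counts[i] += 1
    return [s / k for s, k in zip(sums, counts)]

# Hypothetical scenario: five anchors, anchor 0 is in NLOS with a +1 m distance bias.
anchors = [(0.0, 0.0), (9.0, 0.0), (9.0, 20.0), (0.0, 20.0), (4.5, 4.0)]
tag = (3.0, 12.0)
dists = [math.dist(tag, a) for a in anchors]
dists[0] += 1.0
avg = average_residuals(anchors, dists, x0=(4.0, 10.0))
```

In this example the biased anchor ends up with the largest average residual, but an anchor located roughly opposite to it also receives a positive residual, which hints at why the residual-only labels are noisy.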
To find the threshold that separates LOS from NLOS residuals, we use kernel density estimation (KDE) to get the distribution of the average anchor residuals (for a given location). The kernel density estimator of a series of independent and identically distributed samples $\{R_1, \ldots, R_N\}$ is:
$$\hat{f}_h(r) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left( \frac{r - R_i}{h} \right), \tag{9}$$
where $K$ is a non-negative function called the kernel (we used a Gaussian kernel) and $h > 0$ is a smoothing parameter. Fig. 4 shows three examples of KDE applied on anchor residuals, for $h = 0.08$. When LOS and NLOS anchor residuals form two different clusters, the distribution has two maxima and a single minimum, as in Fig. 4a. In this case, we label anchors whose residual is higher than the minimum as NLOS and the rest as LOS. There can also be ambiguous cases. In Fig. 4b, the residuals are uniformly spread over the interval and the distribution has only one maximum. In Fig. 4c, on the other hand, more than one local minimum is found. The parameter $h$ determines the smoothness of the fitted distribution. If $h$ is too small, the estimated distribution will contain spurious data artifacts, similar to the distribution in Fig. 4c. If $h$ is too high, the estimator cannot capture the underlying data structure, leading to an estimate similar to Fig. 4b. Therefore, a smoothing parameter that is too large or too small leads to ambiguous cases in which the data cannot be labeled. Because we aim for a high LOS/NLOS detection accuracy, we label only unambiguous cases in which the distribution has exactly one minimum. In the current implementation, we use the ambiguous cases only for validation, but in the future we could use them to retrain the RF model and increase its accuracy (see the discussion in Section VI). In Section V-A, we will discuss the choice of $h$ which maximizes the labeling accuracy and the percentage of instances classified.
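The labeling rule can be sketched in a few lines. This is a simplified illustration rather than the authors' implementation: the density is evaluated on a uniform grid between the smallest and largest residual, local minima are found by comparing neighboring grid values, and the function names, grid size, and example residuals are assumptions.

```python
import math

def kde(samples, h, grid):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - r) / h) ** 2) for r in samples)
            for x in grid]

def label_by_residual(avg_residuals, h=0.08, points=400):
    """Return LOS/NLOS labels, or None when the density is ambiguous."""
    lo, hi = min(avg_residuals), max(avg_residuals)
    if hi == lo:
        return None                                # all residuals equal: nothing to split
    grid = [lo + (hi - lo) * k / (points - 1) for k in range(points)]
    dens = kde(avg_residuals, h, grid)
    # Interior local minima of the estimated density.
    minima = [grid[k] for k in range(1, points - 1)
              if dens[k] < dens[k - 1] and dens[k] < dens[k + 1]]
    if len(minima) != 1:
        return None                                # zero or several minima: ambiguous
    threshold = minima[0]
    return ["NLOS" if r > threshold else "LOS" for r in avg_residuals]

# Made-up residuals (in meters): two clear clusters, then an evenly spread set.
clear = label_by_residual([0.02, 0.05, 0.03, 0.60, 0.65])
ambiguous = label_by_residual([0.0, 0.05, 0.10, 0.15, 0.20])
```

The first call yields one label per anchor, while the second returns `None` because the smoothed density has a single maximum and no interior minimum.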
Because not all anchor residuals can be unambiguously split into two intervals, we cannot apply the residual-based labeling on all measurements. Also, we need to compute locations using all anchor subsets, which scales with $2^N$. Therefore, we use the labels provided by the residual method to train an ML model that can classify all distance measurements.

C. Model Training
The identification and mitigation of the LOS/NLOS condition using features based on the CIR of the signal has been studied in [2], [3], [4]. Supervised classification methods have very high accuracy (>90 %) but need training data, i.e., a database of distance measurements labeled as LOS/NLOS and their CIRs. We replace the manual labeling with the unsupervised labeling based on anchor residuals, as described in Section III-B. Because our training set has label noise (LOS measurements labeled as NLOS and vice versa), we want to train a machine learning model that is robust to label noise. We chose a Random Forest (RF) classifier, since it is an ensemble machine learning (ML) algorithm that performs well with noisy labels [16]. The RF is a collection of decision trees, each trained on a bootstrap sample of the training data (bootstrap aggregating), which outputs the class predicted by most of the individual trees.
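To illustrate why a bagged, majority-voting ensemble can tolerate label noise, the sketch below uses a deliberately simplified stand-in for a Random Forest: bagged decision stumps on a single synthetic feature. It is not the classifier used in the paper. The feature values, the assumption that the feature is larger in NLOS, the 20 % deterministic label flips, and all function names are made up for the example.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Best threshold t for the rule 'x > t -> NLOS (1)', minimizing training errors."""
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    if not candidates:                     # degenerate sample with a single value
        return values[0]
    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))
    return min(candidates, key=errors)

def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict(stumps, x):
    """Majority vote over the stumps: 0 = LOS, 1 = NLOS."""
    votes = Counter(int(x > t) for t in stumps)
    return votes.most_common(1)[0][0]

# Synthetic feature: LOS values in [0, 1), NLOS values in [2, 3).
xs = [k / 100 for k in range(100)] + [2 + k / 100 for k in range(100)]
true_labels = [0] * 100 + [1] * 100
# Flip every fifth label to mimic the noisy residual-based labels (20 % noise).
noisy_labels = [1 - y if k % 5 == 0 else y for k, y in enumerate(true_labels)]
stumps = fit_bagged_stumps(xs, noisy_labels)
```

Even though one in five training labels is wrong, the learned thresholds still fall between the two clusters, so samples well inside either cluster are classified according to their true class.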
The purpose of this paper is not to find the best ML algorithm for the task, but to demonstrate the general idea that we can detect NLOS measurements without training data by using residual-based labeling. We leave as future work an exhaustive search through more ML models suitable for data sets with noisy labels that could further increase the NLOS detection accuracy. We train the model using CIR features known to characterize LOS/NLOS conditions well [2], [3]: the energy of the received signal, the maximum amplitude of the signal, the mean excess delay, the RMS delay spread, the kurtosis, and the difference between the TOA and the time at which the signal has the maximum amplitude ($\Delta T(\text{TOA}, \text{Max})$).
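The features listed above can be computed from a sampled CIR roughly as follows. The sketch uses commonly used textbook definitions, which may differ in detail from the ones in [2], [3]; it assumes the CIR is given as a list of amplitude samples, that the index of the detected first path (`toa_index`) is known, and that delays are measured relative to the TOA. The function name and the toy CIR are hypothetical.

```python
import math

def cir_features(cir, toa_index, dt=1.0):
    """Features of a CIR given as amplitude samples |h[k]| with sample period dt.

    Assumes a non-constant CIR with non-zero energy.
    """
    n = len(cir)
    power = [a * a for a in cir]
    energy = sum(power)
    peak_index = max(range(n), key=lambda k: cir[k])
    delays = [(k - toa_index) * dt for k in range(n)]      # delay relative to the TOA
    mean_excess_delay = sum(t * p for t, p in zip(delays, power)) / energy
    rms_delay_spread = math.sqrt(
        sum((t - mean_excess_delay) ** 2 * p for t, p in zip(delays, power)) / energy)
    mu = sum(cir) / n
    var = sum((a - mu) ** 2 for a in cir) / n
    kurtosis = sum((a - mu) ** 4 for a in cir) / (n * var ** 2)
    return {
        "energy": energy,
        "max_amplitude": cir[peak_index],
        "mean_excess_delay": mean_excess_delay,
        "rms_delay_spread": rms_delay_spread,
        "kurtosis": kurtosis,
        "delta_t_toa_max": (peak_index - toa_index) * dt,  # time from TOA to the peak
    }

# Toy CIR: the first path arrives at sample 2 and the strongest path one sample later.
features = cir_features([0, 0, 1, 2, 1, 0], toa_index=2)
```

Each distance measurement then contributes one such feature vector, paired with the label predicted by the residual analysis.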

IV. EVALUATION SETUP
We simulate a localization scenario based on a database of real UWB measurements to evaluate the feasibility and performance of the proposed method for NLOS error detection and correction. Fig. 5 describes the simulation flow. We start from a setup with an area of 9 m × 20 m and $N \in \{5, 6, 7, 8, 9\}$ anchors distributed approximately uniformly on the perimeter of the area. When $N$ is odd, one anchor is in the center of the area, while the others are on the perimeter. We consider a grid of ≈ 1700 true locations of the tag spaced 20 cm apart within the area encompassed by the anchors. In a real deployment with a reasonable location update period of 100 ms, 1700 locations could be obtained in under 3 min. For each true location of the tag, we choose at random $N \cdot Q$ anchors to be in NLOS with the tag, where $Q \in \{0, 0.3, 0.5\}$.
We simulate the distance measurements between each anchor and the tag by adding to the true distance a distance error selected from a measurement database, based on whether the anchor is in LOS or NLOS with the tag. We also store the CIR corresponding to the selected measurement. Although we could simulate the distance errors for each obstacle based on proposed models [15], it is harder to simulate the CIR for a particular type of obstruction. Therefore, for a realistic setup, we preferred selecting the distance error and CIR of a real measurement from a database. We use a database of distance measurements and CIRs acquired with UWB devices developed by 3db Access, which was partly described in [15]. The measurements were acquired in LOS, in NLOS with human body shadowing, or in NLOS with concrete wall shadowing at various indoor locations. The NLOS database aggregates the measurements with both types of obstructions. TABLE I shows the number of measurements acquired in each scenario and the range of covered distances.
We feed the $N$ distances to a localization engine, which computes the 2D location and average anchor residuals over all anchor subsets. For localization, we used the Gauss-Newton multilateration algorithm strengthened with a regularization term. We initialized the algorithm with $\delta = 1$ mm, $k_{\max} = 10$ iterations, $x_r$ = the median of the anchors' locations, and $c = 10^{-1}$ (corresponding to a standard deviation of 10 m around $x_r$, suitable for our setup).
The residual analysis block receives as input the average anchor residuals and predicts the labels of each anchor-tag measurement. The label can be either LOS, NLOS, or ambiguous (in case the density of the anchor residuals does not have exactly one minimum). We repeat the procedure for $M$ locations and build a database of $M \cdot N$ predicted labels and the corresponding CIR features.
We split the database into a training set, which contains the measurements predicted as LOS/NLOS, and a test set, which contains the ambiguous measurements. We use the training set to train a Random Forest classifier, which learns the LOS/NLOS CIR features based on the labels predicted by the residual analysis. Once the model is trained, we can use it to directly classify all measurements, without going through the residual analysis procedure. Finally, we mitigate NLOS measurements and reduce the localization error.

V. EVALUATION
We now evaluate the performance of the residual-based labeling (Section V-A), of the trained RF classifier (Section V-B), and of the NLOS mitigation method (Section V-C).

A. Residual-based LOS/NLOS Labeling
The first step is the unsupervised labeling using anchor residuals, described in Section III-B. We can alternatively formulate the labeling as a detection problem, where the detected event is an NLOS measurement. To evaluate the performance of the labeling method, we use the balanced accuracy, which is the arithmetic mean of the true positive and true negative rates (TPR and TNR, respectively):
$$\text{BA} = \frac{\text{TPR} + \text{TNR}}{2} = \frac{1}{2} \left( \frac{TP}{TP + FN} + \frac{TN}{TN + FP} \right), \tag{10}$$
where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives. Because we label only cases where the density estimate of the anchor residuals has exactly one minimum, we are also interested in the percentage of instances labeled (denoted by $P_C$), defined as the percentage of locations for which the algorithm provides a label for all anchors. This percentage and the accuracy depend on the KDE shaping parameter ($h$). If $h$ is too small or too large, the estimate is undersmoothed or oversmoothed, respectively, resulting in few classified instances (because the distribution has either too many minima or none at all). Therefore, we want to find the KDE shaping parameter which maximizes the accuracy and $P_C$.
Fig. 6 shows the balanced accuracy and percentage of classified instances against $h$ for 30 % and 50 % NLOS anchors. We omitted the case when all anchors are in LOS with the tag because fewer than 1 % of the instances are classified in this case. This is desirable for the RF classifier because it keeps the number of LOS and NLOS measurements balanced. Fig. 6a and 6b show that $h$ changes the balanced accuracy by at most 15 % for $Q = 30$ % NLOS anchors and by at most 10 % for $Q = 50$ % NLOS anchors. On the other hand, Fig. 6c and 6d show that $h$ has a marked impact on the percentage of classified instances. $P_C$ is the highest for $h \in [0.03, 0.06]$ (depending on the number of anchors) and decreases for values outside the interval.
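For reference, the balanced accuracy can be computed from true and predicted labels as in the short sketch below, where NLOS is treated as the positive class. The function name and the example labels are made up for illustration.

```python
def balanced_accuracy(true_labels, predicted_labels, positive="NLOS"):
    """Arithmetic mean of the true positive rate and the true negative rate.

    Assumes both classes occur at least once in true_labels.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn)   # share of NLOS measurements detected as NLOS
    tnr = tn / (tn + fp)   # share of LOS measurements detected as LOS
    return 0.5 * (tpr + tnr)

truth = ["NLOS", "NLOS", "NLOS", "NLOS", "LOS", "LOS"]
predicted = ["NLOS", "NLOS", "NLOS", "LOS", "LOS", "NLOS"]
score = balanced_accuracy(truth, predicted)  # TPR = 3/4, TNR = 1/2
```

Unlike the plain accuracy, this metric weights both classes equally even when one of them has far fewer samples.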
Because, for different values of h, the change in accuracy is small but the change in P C is large, we want a performance score based on these two metrics which increases the impact of P C . Therefore, we chose as the aggregated performance score the harmonic mean between the balanced accuracy and P C .
In practice, we usually have mixed NLOS conditions which can change over time. Therefore, we extend the performance score to be the harmonic mean of the balanced accuracy and percentage of classified instances for both Q = 30% and 50% and choose the KDE shaping parameter which maximizes this score. The optimum shaping parameter is h = 0.04 for N = 5 to 8 anchors and h = 0.03 for N = 9 anchors. TABLE II shows the LOS, NLOS, and balanced accuracy and the percentage of classified instances for the optimum h.
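The selection of the shaping parameter can be sketched as follows. The numbers in `results` are invented solely to show the mechanics and are not taken from Fig. 6 or TABLE II; the function name `best_bandwidth` is also an assumption. The harmonic mean is dominated by its smallest argument, so a value of $h$ that classifies only a few instances scores poorly even when its accuracy is high.

```python
from statistics import harmonic_mean

def best_bandwidth(results):
    """results maps h -> (accuracy_30, pc_30, accuracy_50, pc_50), all in (0, 1]."""
    return max(results, key=lambda h: harmonic_mean(results[h]))

# Hypothetical balanced accuracies and classified-instance ratios for three values of h.
results = {
    0.02: (0.80, 0.10, 0.70, 0.08),
    0.04: (0.78, 0.45, 0.68, 0.40),
    0.10: (0.75, 0.20, 0.66, 0.15),
}
chosen_h = best_bandwidth(results)
two_metric_score = harmonic_mean([0.80, 0.10])  # far below the arithmetic mean of 0.45
```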
The classification accuracy is higher for 30 % NLOS anchors than for 50 %. When more anchors are in NLOS with the tag, the location estimate is more skewed and the residuals of all anchors (not only of NLOS anchors) are larger. In this case, LOS and NLOS anchor residuals are harder to distinguish.
We note that the accuracy slightly decreases for more anchors. This is because, with more anchors (out of which only a few are in NLOS), it is harder to find a value of $h$ low enough to delimit the few NLOS anchors from the LOS ones but high enough to avoid an undersmoothed distribution that leads to more than two intervals.

B. Supervised Classification
We now evaluate the accuracy of the RF classification applied on measurements labeled with residual analysis. The training set consists of labeled measurements and their features. We aggregated the measurements for $Q = 30$ % and 50 % NLOS anchors, since in practice we can have mixed NLOS conditions. We train a classifier for each number of anchors. The test set contains all measurements which were not labeled in the previous step, i.e., where the density of anchor residuals did not have exactly one minimum. The training set has approximately 6,500 to 10,000 samples and an almost equal number of LOS and NLOS samples. We use stratified K-fold cross-validation with $K = 4$ folds to identify the best model parameters from a specified subset. Fig. 7 shows the LOS, NLOS, and balanced accuracy of the model. The NLOS detection accuracy slightly exceeds 90 % in all cases, while the LOS accuracy exceeds 95 %. Compared with only the residual-based labeling, we gain 10-20 % accuracy. It is perhaps surprising that the RF classification accuracy exceeds 90 % even when the labeling accuracy can be as low as 63 %. This is because noisy labels resemble outliers or anomalies and ML models can usually recover the correct class boundaries to a certain extent [16].

C. NLOS Mitigation
We now devise a strategy for handling NLOS measurements in order to improve the localization accuracy. We present the NLOS mitigation procedure as a pseudocode in Algorithm 1 and describe it in the following.
If there are enough LOS anchors to compute one location (at least $D + 1$ anchors for $D$ dimensions) and the set of anchors is not degenerate (i.e., the anchors are not collinear), we can use only the LOS anchors to compute the location. However, if there are few LOS anchors and their placement is not ideal (for instance, the tag falls outside the convex hull of the anchors), the location estimated using only the LOS anchors can sometimes have large errors. Therefore, we noticed that we obtain better location estimates if we correct NLOS measurements and use them for localization. For correction, we first estimate the intermediate location using only the LOS anchors. We compute the residuals of the NLOS anchors based on the intermediate location. Then, we subtract the residuals from the measured distances of NLOS anchors. We estimate the final location using the distance measurements of LOS anchors and the corrected distances of NLOS anchors.
[Algorithm 1: NLOS mitigation. The distance-correction step loops over the anchors $A_i$, $i = 1, \ldots, N$, and corrects the distance of each anchor that is in NLOS.]
If the set of LOS anchors is degenerate or there are not enough LOS anchors to compute the location, we must use some NLOS anchors to compute the tag's location. Because we cannot correct the NLOS measurements as in the previous case, we generate all subsets $S_k$ containing all LOS anchors and all combinations of NLOS anchors such that $|S_k| \geq D + 1$. For each subset, we compute the intermediate location and the subset's residual using Eq. (7). The final location is the weighted linear combination of all intermediate locations, similar to the method proposed in [1], except that we do not use all possible anchor subsets.
We compute the localization error as the Euclidean distance between the estimated location $\hat{x}$ and the true location $x$:
$$e = \lVert \hat{x} - x \rVert. \tag{11}$$
We evaluate the localization error on the test data set from Section V-B. For each set of anchor-tag measurements, we predict the LOS/NLOS condition using the trained model and then apply Algorithm 1 to mitigate the NLOS errors. TABLE III compares the localization errors obtained with the proposed mitigation algorithm (denoted by "proposed") with those obtained when using all anchor-tag distances, without mitigation. The localization errors of the proposed method have 1.8-2.8× smaller mean and 1.8-11.6× smaller standard deviation after NLOS mitigation. On average, the algorithm reduces the mean and standard deviation by 2.2 and 5.8 times, respectively. Therefore, the proposed method can successfully mitigate localization errors caused by NLOS propagation.
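The distance-correction branch of the mitigation procedure (the case with enough LOS anchors) might look like the sketch below. It is an illustration of the steps described in the text, not a transcription of Algorithm 1: the collinearity check and the subset-weighting fallback are omitted, `locate` is a simplified regularized Gauss-Newton solver that is regularized toward its initial guess, and the anchor layout, the 1 m bias, and the function names are assumptions.

```python
import math

def locate(anchors, dists, x0, c=0.1, k_max=10, tol=1e-3):
    """Regularized Gauss-Newton in 2D, regularized toward the initial guess x0."""
    x, y = x0
    for _ in range(k_max):
        a11 = a22 = c * c
        a12 = 0.0
        b1, b2 = c * c * (x0[0] - x), c * c * (x0[1] - y)
        for (ax, ay), d in zip(anchors, dists):
            rng = math.hypot(x - ax, y - ay) or 1e-9
            jx, jy = (x - ax) / rng, (y - ay) / rng
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * (d - rng)
            b2 += jy * (d - rng)
        det = a11 * a22 - a12 * a12
        dx, dy = (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            break
    return x, y

def mitigate(anchors, dists, is_nlos, x0, dim=2):
    """Correct NLOS distances with residuals from a LOS-only intermediate location."""
    los = [i for i, flag in enumerate(is_nlos) if not flag]
    if len(los) < dim + 1:
        raise ValueError("too few LOS anchors; the subset fallback is not sketched here")
    intermediate = locate([anchors[i] for i in los], [dists[i] for i in los], x0)
    corrected = list(dists)
    for i, flag in enumerate(is_nlos):
        if flag:
            residual = dists[i] - math.dist(intermediate, anchors[i])
            corrected[i] = dists[i] - residual   # subtract the NLOS anchor's residual
    return locate(anchors, corrected, intermediate)

# Hypothetical scenario: anchor 0 is in NLOS with a +1 m bias and is detected as such.
anchors = [(0.0, 0.0), (9.0, 0.0), (9.0, 20.0), (0.0, 20.0), (4.5, 4.0)]
tag = (3.0, 12.0)
dists = [math.dist(tag, a) for a in anchors]
dists[0] += 1.0
flags = [True, False, False, False, False]
start = (4.0, 10.0)
error_raw = math.dist(locate(anchors, dists, start), tag)
error_mitigated = math.dist(mitigate(anchors, dists, flags, start), tag)
```

In this toy case the mitigated estimate is much closer to the true location than the estimate that uses the biased distance as is.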

VI. DISCUSSION AND FUTURE WORK
One remaining question is whether it is worth deploying more anchors than the minimum necessary (e.g., three anchors for 2D localization). In buildings with rooms separated by thick walls, if we want to provide high-accuracy location services everywhere, we have to deploy at least three anchors in each room. However, the tag can still be in the range of anchors in other rooms, so it has to decide which anchors are in LOS. Even within the same room, some anchors might be shadowed by surrounding objects, so it is worth having extra anchors. In TABLE III, we see that when we correct NLOS errors, the localization error decreases for more anchors. Therefore, more anchors than the minimum are often needed in practical deployments.
So far, we have not discussed which entity should train the classifier: the localization engine (LE) or the tag. There are arguments for both sides. On the one hand, NLOS errors depend on the environment in which the LE operates. For instance, in an industrial setting with metallic objects, NLOS errors might be larger than in an office. Therefore, the LE could collect anchor-tag distances from tags operating in an area and train a model which can then classify all measurements at the LE or the tag, depending on which entity computes the location. On the other hand, the learned CIR features can also depend on the hardware of the end device. For instance, the CIR can have different shapes depending on the type of UWB device [15]. Therefore, models trained on features from one hardware model might not generalize well to others. One future research direction is to evaluate how well a model trained on one type of hardware or at one location generalizes to other device models and environments. If CIR features are indeed model- or environment-specific, one option is to train an initial model and periodically update it with new data from different devices and locations. This is also beneficial if the training is performed by the tag, since it requires less data storage and the model can be updated online. Since the residual labeling step outputs labels only for a part of the input measurements, the accuracy of the model could be improved using semi-supervised learning methods. For instance, the model can be retrained using its most confident predictions [17], which can also speed up the training process.
The localization error can also be reduced by applying Chen's residual weighting algorithm [1] or variations of it, without going through the labeling and training process. However, computing the location using all anchor combinations has a complexity of $O(2^N)$ for $N$ anchors. In our case, the complexity is high only during the short training phase. After this, samples can be classified with the RF with a complexity of $O(kp)$ that does not depend on the number of anchors, where $k$ is the number of decision trees and $p$ is the maximum depth of one tree. In practice, the execution time of the proposed NLOS mitigation method (excluding the training phase) was shorter than that of the residual weighting algorithm for $N \geq 8$ anchors.
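To give a feeling for the exponential growth, the following sketch counts the anchor subsets with at least $D + 1$ anchors that a residual-weighting approach has to evaluate per location. The function name is an assumption; the counts follow directly from binomial coefficients.

```python
from math import comb

def subset_count(n_anchors, dim=2):
    """Number of anchor subsets with at least dim + 1 anchors."""
    return sum(comb(n_anchors, k) for k in range(dim + 1, n_anchors + 1))

counts = {n: subset_count(n) for n in range(5, 10)}
# counts == {5: 16, 6: 42, 7: 99, 8: 219, 9: 466}
```

Each subset requires one run of the localization algorithm, so going from 5 to 9 anchors multiplies the number of location computations by roughly 29.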
NLOS identification has interesting applications beyond reducing localization errors, especially when it can be done without supervision, as in our proposal. Our method can potentially be used to build maps of a building by aggregating the locations at which the tag is consistently in NLOS with certain anchors. Crowd density estimation is another interesting possible application. Since human body shadowing introduces large distance errors, any increase in the number of detected NLOS measurements could suggest that a room gets more populated. Note that this method does not require all users to be connected to the localization network. Finally, the trained model can be applied on individual distance measurements, so it can be useful in peer-to-peer proximity applications (e.g. contact tracing, object finding).

VII. CONCLUSIONS
We proposed a method for detecting NLOS measurements in localization systems without manually acquired training data or knowledge of channel statistics. The method predicts the LOS/NLOS labels using the measured distances between each anchor and the tag. We use the predicted labels and the CIR features of the measurements to train a classifier, which has over 90 % classification accuracy. We also proposed an NLOS mitigation technique which reduces, on average, the mean and spread of the localization error by at least 2×.