A Privacy-Aware Remapping Mechanism for Location Data

In an era dominated by Location-Based Services (LBSs), preserving location privacy has emerged as a critical challenge. To address it, Location Privacy-Preserving Mechanisms (LPPMs) have been proposed, in which an obfuscated version of the exact user location is reported instead. In addition to noise-based mechanisms, location discretization, the process of transforming continuous location data into discrete representations, is relevant for the efficient storage of data, simplifying the manipulation of the information in a digital system and reducing the computational overhead. Apart from enabling more efficient data storage and processing, location discretization can also be performed under privacy requirements, so as to ensure discretization while providing privacy benefits. In this work, we propose a Privacy-Aware Remapping mechanism that improves the privacy level attained by Geo-Indistinguishability through a tailored pre-processing discretization step. The proposed remapping technique reduces the re-identification risk of locations under Geo-Indistinguishability, with limited impact on quality loss.


INTRODUCTION
Location-Based Services (LBSs), while undeniably valuable in enhancing the convenience and efficiency of our daily lives, can give rise to significant privacy concerns. These services rely on tracking and storing an individual's real-time location data, which, if misused or mishandled, could lead to severe repercussions. From the perspective of personal privacy, the constant monitoring of an individual's movements can paint an intimate picture of their habits, preferences, and even sensitive activities [9,12,24]. In the wrong hands, this data could be exploited for targeted advertising, identity theft, or surveillance, compromising individuals' autonomy and security. Additionally, under inference attacks, location-based services might (un)intentionally disclose sensitive locations, like home, workplace, health and religious institutions, as well as information about users, their habits and conditions, thus making users vulnerable to potential threats.
Over the past two decades, important achievements have been accomplished in user protection, specifically in the fields of anonymization and obfuscation techniques [6,13]. Anonymization involves modifying or removing personally identifiable information from datasets, making it challenging to link specific data points to individual users. On the other hand, obfuscation mechanisms introduce noise or perturbations to location data, making it more challenging to pinpoint an individual's exact location. Despite their differences, both methods make use of spatiotemporal generalization, which involves aggregating or reducing the granularity of location data to a certain level. By doing so, they mask precise details about a user's movements and activities, preserving their anonymity while still providing useful information for analysis or services.
Geo-Indistinguishability [2] is acknowledged as the state of the art in Location Privacy-Preserving Mechanisms (LPPMs), with Planar Laplace (PL) being the first mechanism proposed to achieve it. Built upon the foundation of differential privacy [7], Geo-Indistinguishability ensures that, even when sharing location information for various services or applications, an individual's true whereabouts remain hidden within a radius r with a level of privacy that depends on the radius, providing a delicate balance between utility and privacy.
Remapping techniques have been proposed for Planar Laplace to increase the utility of the queries without degrading the privacy level, by feeding the noisy generated location to a remapping function which relocates it in a new, more suitable location, according to the remapping function's metric [4]. In fact, the PL mechanism with optimal remapping is considered the state of the art of Geo-Indistinguishability in sporadic location privacy [21]. The optimal remapping techniques only use the current obfuscated LPPM output and the mobility profile of the user to map an obfuscated location into a grid-based discrete location [2,4]. Remapping aims at preserving the privacy guarantees while keeping the quality loss of the same order as that of the noised location generated through the privacy-preserving mechanism. However, we verified that after applying a privacy-preserving mechanism and remapping into a grid, the users' locations become much more unique due to the spreading effect, which could pose a threat if the anonymization poorly protects user traits. Therefore, we propose a novel algorithm which produces a remapping function from the cells of a uniform grid into itself. Our proposed remapping mechanism aggregates groups of cells from a uniform partitioning of the domain of locations. This process of cell aggregation takes into consideration the frequency of locations that actually appear in the dataset, to produce a smaller set of utilized cells, i.e. cells effectively used during the discretization process. At a small cost in terms of utility, our discretization method makes it more difficult for an adversary to re-identify individuals. Our remapping function strategically transforms cells within the grid, effectively diminishing the impact of the weighted noise introduced through Geo-Indistinguishability techniques. This yields a more sensible remapping of locations, ensuring that the privacy enhancements gained from obfuscation methods are not compromised, whilst producing a discrete dataset.
The major contributions of this paper are summarized as follows:
• identification of flaws arising from remapping Laplacian-noised locations with a uniform grid-based discretization mechanism;
• proposal of a novel algorithm which addresses the identified flaws at a small cost in utility.
The remainder of the paper is organized as follows: we review existing work in Section 2. Section 3 describes our remapping mechanism. We evaluate our proposed method in Section 4 and discuss the obtained results. Section 5 concludes our work. The notation used throughout the paper is presented in Table 1.

BACKGROUND AND STATE OF THE ART
This section provides an overview of background concepts and state-of-the-art approaches taken into consideration in the development of the Privacy-Aware Remapping mechanism. First, in Section 2.1, we present the concept of Geo-Indistinguishability, and we consider one mechanism that achieves it in Section 2.2. In Section 2.3 we discuss discretization techniques and their properties. Finally, in Section 2.4, we introduce the main focus of this work: the composition of obfuscation methods with discretization techniques.

Geo-Indistinguishability
Geo-Indistinguishability [2] is a privacy concept and technique that ensures that an individual's exact location is indistinguishable from a set of nearby locations, thereby preventing the precise identification of a user's movements and activities. Geo-Indistinguishability achieves this by introducing noise or perturbations to the location data, making it challenging to link specific location traces to a particular individual. The level of indistinguishability is controlled by a parameter called the privacy budget ε, which determines the amount of noise added to the data. The main idea behind a Geo-Indistinguishable mechanism is the guarantee that the user location x is indistinguishable from any other nearby location x′ based on the obfuscated report z.
Formally [17], denoting by K an obfuscation mechanism which assigns to every true location x ∈ X a probability distribution on Z, the set of all obfuscated locations, this mechanism satisfies ε-geo-indistinguishability iff for all x, x′ ∈ X:

d_P(K(x), K(x′)) ≤ ε · d_X(x, x′)    (1)

where d_X(·) is any distance function and d_P(·) is the multiplicative distance between two distributions, defined as:

d_P(σ₁, σ₂) = sup_{A ⊆ S} |ln(σ₁(A) / σ₂(A))|    (2)

where σ₁ and σ₂ are two distributions on some set S, following the convention that |ln(σ₁(A)/σ₂(A))| is 0 when both probabilities are zero and ∞ when only one of them is zero. Intuitively, condition (1) states that the probability of reporting location z while standing in location x is similar to that of standing in any location x′ [16]. In particular, both probabilities differ at most by the distance between x and x′ factored by a small constant ε. This constant is usually set to ε = ℓ/r [2], which represents a simple way to specify a user's privacy requirements: a level of privacy ℓ within a radius r, enforcing that any x′ within distance r of x discloses at most ℓ information.

Planar Laplace Mechanism
Planar Laplace (PL) [2] was the first mechanism proposed to satisfy Geo-Indistinguishability and consists of adding 2-dimensional Laplacian noise centered at the true user location x. Formally, for all x ∈ X and z ∈ Z, the probability density function (pdf) is given by:

D_ε(x)(z) = (ε² / 2π) · e^(−ε·d(x,z))

Obtaining z, the obfuscated location, from x can be efficiently done using polar coordinates [2]: (1) draw θ uniformly in [0, 2π); (2) draw p uniformly in [0, 1); (3) set r = C_ε⁻¹(p), where the cumulative density function (cdf) C_ε(r) represents the probability that the radius of the randomly generated point falls between 0 and r:

C_ε(r) = 1 − (1 + ε·r) · e^(−ε·r)    (3)

Therefore, using the Lambert W function at branch −1, the inverse function is defined as:

C_ε⁻¹(p) = −(1/ε) · (W₋₁((p − 1)/e) + 1)    (4)

Finally, simply report z = x + ⟨r cos(θ), r sin(θ)⟩. We will denote by PL_ε : X → Z the above mechanism.
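For concreteness, the sampling procedure above can be sketched as follows. The paper inverts the cdf with the Lambert W function; this dependency-free sketch instead inverts C_ε(r) numerically by bisection, which yields the same radius. Coordinates are assumed planar (e.g. meters), with ε given in the reciprocal unit; the function names are illustrative.

```python
import math
import random

def pl_inverse_cdf(p, eps, hi=1e7):
    """Invert C_eps(r) = 1 - (1 + eps*r) * exp(-eps*r) by bisection.

    The cdf is monotone in r, so 200 bisection steps pin the radius
    down to far below floating-point noise.
    """
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        cdf = 1.0 - (1.0 + eps * mid) * math.exp(-eps * mid)
        if cdf < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def planar_laplace(x, y, eps):
    """Report z = x + <r cos(theta), r sin(theta)> (planar coordinates)."""
    theta = random.uniform(0.0, 2.0 * math.pi)  # step (1)
    p = random.random()                         # step (2)
    r = pl_inverse_cdf(p, eps)                  # step (3)
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

With ε = 0.008 m⁻¹ (i.e. 8 km⁻¹), the sampled radii average 2/ε = 250 m, matching the parameter ranges used later in the evaluation.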

Data Discretization
Location data discretization is a crucial pre-processing step for managing and sharing location-based data effectively while upholding individual privacy. It involves transforming continuous and precise location coordinates in R² into a discrete set of points W, such as cells or clusters. Additionally, it allows us to reduce the granularity of the data, making it less precise and less likely to pinpoint individuals' exact locations, while still providing high levels of utility.
Our proposed mechanism, Privacy-Aware Remapping, remaps locations to other locations and takes advantage of the finite set of potential locations resulting from discretization. One widely used discretization approach is the grid-map method, used in various scenarios [2,4,11,18,19,22,25-27], where geographic regions are partitioned into a grid of uniform cells. Another technique involves clustering methods [8,10,23,29], which group similar data points together, creating clusters that represent geographical areas. Additionally, Voronoi diagrams [14,20] divide a geographical space into cells based on proximity to specific data points, ensuring that each region is associated with the nearest data point.
We decided to proceed with the study of the grid-based discretization technique, referred to as the uniform grid throughout the paper, due to its distinct advantages. One of the key reasons for selecting this method is its simplicity of storage. The uniform grid discretization can be efficiently represented as a matrix, making it an ideal choice for data storage as well as data manipulation. Furthermore, this technique allows for precise control over the size of the discrete locations, which is especially valuable when analysing data transformations, like privacy-preserving methods such as Geo-Indistinguishability.
A uniform grid, denoted by G, partitions geographic regions into cells of constant size. Grid discretization is a powerful yet simple technique used to efficiently represent and process location data. Given the four corners of a bounding box as well as a constant cell spacing, the mechanism divides the box into N = N₀ × N₁ cells, represented by an N₀-by-N₁ matrix. Real-time services, such as navigation systems and points-of-interest (POI) finders, can benefit from this structured representation since the obfuscation can be calculated in real time: simply pinpoint the cell that contains the ground truth, i.e. the real location. Moreover, the quality loss resulting from this discretization is bounded and predictable since each cell maintains a constant size. Denoting by s the grid cell spacing and considering that the reported location of a cell is its center, the maximum obtainable quality loss is s/√2 (the points within the cell furthest from its center are its corners).
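As a minimal illustration of the grid discretization (hypothetical helper names; the bounding-box origin is (x0, y0) and s is the cell spacing, in the same units as the coordinates):

```python
import math

def cell_of(x, y, x0, y0, s):
    """Map planar coordinates to (row, col) indices of the uniform grid."""
    return int((y - y0) // s), int((x - x0) // s)

def cell_center(i, j, x0, y0, s):
    """Center of cell (i, j); reporting it costs at most s/sqrt(2)."""
    return x0 + (j + 0.5) * s, y0 + (i + 0.5) * s
```

For example, with s = 100 a point sitting exactly on a cell corner lands 100/√2 ≈ 70.7 units from its cell center, which is the worst case stated above.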
We will denote by δ : X ∪ Z → G the function which discretizes locations to a cell of the grid G.

Remapping Locations
We are now prepared to discuss how to add a layer of discretization after applying Laplacian noise to a real location, which is the main focus of this work.
In [2], it has been proven that grid-map discretization preserves Geo-Indistinguishability. The authors have defined the probabilistic mechanism K_G : X → P(G), where K_G(c|x) represents the probability of reporting the cell c when the actual ground-truth location is x, with x ∈ X and c ∈ G. The mechanism involves generating (θ, r) and computing z, as previously described, and then remapping z to the closest point on G. Formally, given x ∈ X we obtain c = (δ ∘ PL_ε)(x) and report its center.
Let us focus on Figure 1. We extracted a user's trajectory from the data we used throughout the paper and applied K_G to demonstrate how it affects the utilized cells (i.e. cells of the grid that contain location reports). As can be seen, the application of a mechanism such as PL spreads out the true locations, sometimes even to cells of the grid which were previously empty. This result implies that a larger number of discrete points (centers of cells) is needed to discretize every obfuscated location, i.e. the number of unique locations in the discrete set of true locations, X*, will be much smaller than the number of unique locations in the discrete set of obfuscated locations, Z*.

Figure 1: Noise added by PL mechanism and how it affects the utilized cells
This result can contribute to privacy threats to the individual, as we will see in Section 4.2. The remapping our algorithm produces, introduced in the following section, comes in as a privacy context-aware choice which sharply decreases the number of cells needed to discretize the data, while still achieving Geo-Indistinguishability. Additionally, we will verify how the use of a smaller set of cells also dramatically decreases the number of individuals affected by re-identification attacks through their most visited and preferred locations.

A PRIVACY-AWARE REMAPPING MECHANISM
The algorithm we propose generates a remapping function, denoted by R, which maps individual cells within a grid to cells of the same grid. The distinctive property R offers is its ability to minimize the weighted quality loss inherent to the combination of Laplacian noise and grid remapping. When applied to a specific cell c, R(c) will identify and designate the cell that minimizes the weighted quality loss, considering all the possible cells to which PL_ε might report locations from the cell c. We will then verify that R is in fact neither injective, i.e. given two cells c₁, c₂ ∈ G, c₁ ≠ c₂ does not imply that R(c₁) ≠ R(c₂); nor surjective, i.e. there may be cells c₂ for which no c₁ exists s.t. R(c₁) = c₂. So the number of cells which get reported through R will be smaller than the number of cells that a uniform grid requires. We will show how that provides extra privacy guarantees, thus making re-identification of users more challenging. Furthermore, this remapping generates a grid composed of multiple aggregated cells, each formed by a unique set of cells and a single output center which minimizes our metric, so it is able to determine which areas need more or less generalization. This section is divided as follows: in Section 3.1 we present the pseudocode of the generator of the Privacy-Aware Remapping; in Section 3.2 we explain how the remapping is meant to be used and some of its properties; and finally, we discuss the computational complexity in Section 3.3.

Algorithm in a Nutshell
The pseudocode of the generation of the remapping function is listed in Algorithm 1. The inputs to the algorithm are the dataset M, a uniform grid G, and the obfuscation radius r related to the Laplacian noise. The algorithm then returns a remapping map R : G → G.
In the initialization phase, the remap R is initialized as an empty map (line 1). Afterwards, using the weight-computation subroutine (line 2), we build a weight function w s.t. w(c) holds the number of ground-truth reports which lie on the grid cell c. This can be achieved by a simple iteration over every report of the dataset, translating the coordinates into cells of the grid and incrementing accordingly.
The algorithm then enters the main loop (lines 3-17), which iterates over all cells c of G and calculates the best candidate c′ to be reported instead, according to a metric we now describe. Let S_r(c) denote the set of cells to which Planar Laplace might send locations in c, i.e. S_r(c) contains every cell c′ such that the distance from c to c′ is less than or equal to r. In the for conditions of lines 6 and 8, we verify whether c′ and c″ are in S_r(c), respectively, using the great-circle distance.
Finally, based on the Bayesian remap [4], we define the optimal candidate for remapping c as the cell that minimizes the weighted quality loss, formally described as:

R(c) = argmin_{c′ ∈ S_r(c)} Σ_{c″ ∈ S_r(c)} w(c″) · d_E(c′, c″)    (5)

where d_E(·) denotes the Euclidean distance between two cells, since each cell can be represented by its row and column offsets in the matrix of the grid G. This is accomplished on lines 8-14.
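A compact sketch of the generator follows (hypothetical function names; unlike Algorithm 1, which checks membership in S_r(c) with the great-circle distance, this sketch uses Euclidean cell offsets both for membership and for the loss):

```python
import math
from collections import Counter

def build_weights(reports, x0, y0, s):
    """w(c): number of ground-truth reports that fall in cell c (line 2)."""
    w = Counter()
    for x, y in reports:
        w[(int((y - y0) // s), int((x - x0) // s))] += 1
    return w

def generate_remap(cells, w, r_over_s):
    """For each cell, pick the candidate within radius r (in cell units)
    minimizing the weighted quality loss of Equation (5)."""
    R = {}
    k = math.ceil(r_over_s)
    for (i, j) in cells:
        # candidate cells inside the obfuscation circle around (i, j)
        cands = [(i + di, j + dj)
                 for di in range(-k, k + 1) for dj in range(-k, k + 1)
                 if math.hypot(di, dj) <= r_over_s]
        best = min(cands, key=lambda c1: sum(
            w[c2] * math.hypot(c1[0] - c2[0], c1[1] - c2[1])
            for c2 in cands))
        R[(i, j)] = best
    return R
```

Note how two adjacent cells whose obfuscation circles overlap a common high-weight cell both remap to it, which is exactly the non-injectivity discussed in the next subsection.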

Remapping Planar Laplace Points Using R
Using our Privacy-Aware Remapping R results in an extra step before reporting the obfuscated location. As before, given the actual location x ∈ X, we generate (θ, r) and compute z ∈ Z following the PL methodology (Section 2.2). Afterwards, we get c ∈ G, the cell of the uniform grid where z is contained, and we finally report the cell R(c). Formally, we obtain c = R(δ(PL_ε(x))) and report its center.
Let us discuss some properties which this remapping provides. The computation of the function R generates groups of cells which report the same cell. Let T_c represent the set of cells which get remapped to c, i.e. for all c′ ∈ G:

c′ ∈ T_c ⟺ R(c′) = c

Note that c might not necessarily belong to T_c: the algorithm might find a more suitable cell in the set S_r(c) which decreases the overall weighted quality loss (Equation 5). This abstraction of the remapping R allows us to see the uniform grid as a coarser grid, by considering every set T_c ≠ ∅ as a unified cell which has the cell c as its center.
Additionally, due to the nature of the algorithm, we found that it is highly likely that there will always exist some cell c such that |T_c| > 1. That can be justified by the simple fact that two neighbouring cells c₁, c₂ have nearly identical obfuscation sets, respectively S_r(c₁) and S_r(c₂), so the cell which minimizes the weighted quality loss of each will most likely be a cell in the intersection of both sets. Naturally, this result depends on the grid spacing s, the privacy parameter ε, as well as the obfuscation radius r, since these determine how much the obfuscation sets of neighboring cells intersect. Therefore, assuming an appropriate configuration of the parameters according to the dataset and the desired utility/privacy trade-off, the R remapping is, by definition, not injective. Since the domain and co-domain are the same finite set, the pigeonhole principle implies that R is also not surjective: there exist cells c₂ for which no c₁ satisfies R(c₁) = c₂.
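The grouping T_c can be materialized directly from a computed remap R (a hypothetical helper):

```python
def aggregated_cells(R):
    """Invert the remap: T[c] is the set T_c of cells reported as c."""
    T = {}
    for c_in, c_out in R.items():
        T.setdefault(c_out, set()).add(c_in)
    return T
```

Any T[c] with more than one element witnesses non-injectivity, and any cell absent from T's keys witnesses non-surjectivity.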

Computational Complexity
In order to determine the exact computational complexity of Algorithm 1, let us consider a dataset with |M| reports, a grid G with N cells of constant size s, as well as an obfuscation radius r.
At the initialization phase, the first instruction (line 1) takes constant time. Computing the weight w of each cell (line 2) requires verifying in which cell each report of M lies. That can be done in O(|M|).
The main loop from line 6 to 15, which runs N times, requires some enhancements to avoid an O(N³) algorithm, which would quickly become infeasible for grids with tens of thousands of cells. For each c ∈ G, we are interested in performing a nested loop over every cell in S_r(c), which are all the cells at a distance of at most r from c, as explained before. Instead of a linear search over every cell to verify whether the condition is met (as described in Algorithm 1), one can simply consider a square of cells centered at c composed of ⌈2·r/s + 1⌉² cells, since the obfuscation circle with center c and radius r is inscribed in this square, as can be seen in Figure 2 (the green area corresponds to S_r(c) and the red area to the search space of the algorithm). So the complexity of this loop is O(N · ⌈2·r/s + 1⌉⁴), which can be simplified to O(N · (r/s)⁴) assuming r is rounded to the next closest multiple of s, and following the properties of Big-O notation.
Therefore, the overall complexity of our algorithm is O(|M| + N · (r/s)⁴), so it grows as the grid spacing decreases (r/s and N increase) as well as when considering a greater obfuscation radius. Note that the main loop can be executed concurrently, so it is possible to divide the workload among multiple processing units or threads, resulting in a feasible execution time even when considering large datasets and small levels of granularity.
In the next section, we provide reasoning for the additional computational cost needed to construct the remapping function, as opposed to the straightforward use of the uniform grid.

EVALUATION AND DISCUSSION
To evaluate the effectiveness of combining the PL mechanism with remapping using only the uniform grid versus the additional remapping of R, we selected a real mobility dataset, Geolife [31], which was collected over a period of more than three years from GPS devices. The dataset contains data from 182 users, 17,621 trajectories and roughly 25 million reports. Following [15], we first limited the distribution of reports to a bounding box over the 5th Ring Road of Beijing, China. It is delimited to the South and North by latitudes 39.753 and 40.026, and to the West and East by longitudes 116.199 and 116.547, still leaving us with approximately 16 million reports. This division allowed us to focus on a high-traffic urban area surrounded by suburbs with a lower density.
As the constant spacing of the grid cells, we fixed the value of 100 meters [3,28], which we found to provide a reasonably high level of resolution for most practical purposes. For the privacy budget ε, we used multiple values in the typical ranges of privacy-preserving mechanisms for continuous reports [1,5,15], specifically ε = {4, 8, 16, 32} km⁻¹. For the PL mechanism, this corresponds to an average obfuscation of 500, 250, 125 and 62.5 m, respectively. With these values across the 100-meter intervals of the grid cells, we can investigate scenarios where the obfuscation, on average, extends far beyond the confines of the location's cell, approaches the cell boundary, or remains entirely contained within the cell itself.
For the value of r, the obfuscation radius, let us recall (Equation 4) the inverse cumulative density function of PL and how this function, given a probability p, returns the radius r for which the probability of falling within that radius is equal to p. Note that:

lim_{p→1⁻} C_ε⁻¹(p) = +∞

since lim_{p→1⁻} (p − 1)/e = 0⁻ and lim_{y→0⁻} W₋₁(y) = −∞, for all ε ≠ 0, so the obfuscation radius given by C_ε⁻¹ can become arbitrarily large as p gets closer to 1. Therefore, there is no p we can feed to C_ε⁻¹ that bounds the noise added across all obfuscated locations. We thus decided to set r = C_ε⁻¹(p) + s/√2, with p equal to 95%. The additional s/√2 represents the maximum quality loss generated by discretizing a real location on a uniform grid, as mentioned in Section 2.3. This way, we can focus on the obfuscation circle centered at each cell c ∈ G with radius r, knowing that the 95th percentile of all obfuscated locations generated at c is contained in the circle.
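The radius used in the experiments can be computed as follows (a sketch with an illustrative function name; the percentile is found by numerically inverting the monotone cdf rather than via the Lambert W function):

```python
import math

def obfuscation_radius(eps, s, p=0.95):
    """r = C_eps^{-1}(p) + s/sqrt(2): the p-th percentile of the PL
    radius plus the worst-case grid discretization error."""
    lo, hi = 0.0, 1e7
    for _ in range(200):  # bisection on the monotone cdf C_eps(r)
        mid = (lo + hi) / 2.0
        if 1.0 - (1.0 + eps * mid) * math.exp(-eps * mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0 + s / math.sqrt(2.0)
```

For ε = 8 km⁻¹ (0.008 m⁻¹) and s = 100 m this gives roughly 660 m, i.e. well beyond the 250 m average obfuscation, as a 95th percentile should be.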

Quality Loss
Quality loss is a point-by-point metric, measuring the quality lost between the ground truth and the data obtained after applying a privacy-preserving mechanism, i.e. for an original location x ∈ X and the respective obfuscated location z ∈ Z, the quality loss is given by d_E(x, z), the Euclidean distance. When considering remapping after using PL, we actually compute d_E(x, c), where c ∈ G represents the center of the cell where z is contained.
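Measured this way over a dataset, the metric reduces to the mean point-wise Euclidean distance between paired true and reported points (a hypothetical helper):

```python
import math

def avg_quality_loss(truth, reported):
    """Average Euclidean distance between paired true/reported points."""
    return sum(math.dist(t, z) for t, z in zip(truth, reported)) / len(truth)
```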
Figure 3 quantifies the quality loss of our Privacy-Aware Remapping compared to a plain Uniform Remapping and to Planar Laplace without remapping (PL_ε), for different values of the privacy budget ε. Initially, one can see how little remapping PL to a uniform grid affects the quality loss. Since the maximum error introduced by this operation is s/√2, this cost becomes negligible as ε decreases, i.e. as obfuscation increases. The weighted quality loss used in our proposal (see Equation 5), although related, does not necessarily decrease the overall quality loss. Let x ∈ X, z the output of PL_ε evaluated at x, and c the cell where z is contained. With high probability (95%, from the configuration of the obfuscation radius r), the true location x will be contained in the circle of obfuscation centered at cell c with radius r. On the other hand, c′ = R(c), by definition, will also be a cell of that circle. Therefore, the overall introduced error must be bounded by the diameter of the circle, i.e. d_E(x, c′) ≤ 2·r. Additionally, we found that the smaller the ε, the larger the loss of utility with remapping. This behaviour is already expected when applying PL: the average obfuscation for a privacy budget ε is 2/ε, so the obfuscation amount increases inversely to ε. As can be observed, using the Privacy-Aware Remapping introduces an average obfuscation of 2/ε + β, where β > 0 also grows inversely to ε but β ≪ 2/ε. For instance, an epsilon value of 4 km⁻¹ resulted in an average obfuscation of 414 meters and, consequently, in an average quality loss of 416 and 478 meters when applying Uniform Remapping and Privacy-Aware Remapping, respectively, which is still under the expected obfuscation of 500 meters. Therefore, we argue that the amount of noise generated with the R remapping is of the same order as the noise from Uniform Remapping.
In the upcoming sections, we present beneficial outcomes that justify the supplementary error introduced.

Number of utilized cells
We focused on comparing the number of utilized cells with and without the R remapping, having as a baseline the number of unique cells needed to discretize the non-obfuscated data, which we refer to as the ground truth. As previously described, a utilized cell is a cell of the grid which contains location reports. Table 2 summarizes the obtained results. Firstly, note that the number of utilized cells without obfuscation is constant across all privacy parameters ε, equal to 41120. Next, note the effect that the values of ε have on the number of utilized cells. As mentioned in Section 2.4, applying PL_ε spreads out the true locations (see Figure 1), and the radius of obfuscation grows as the privacy parameter ε decreases. Therefore, Uniform Remapping contributed to a large increase in the size of this set and, consequently, there are more unique discrete locations than unique ground-truth locations, leading users' trajectories to become more unique overall. This result can lead to privacy threats in cases where the anonymity mechanism performs poorly, for instance when the applied noise maps true locations into non-admissible or sparsely frequented ones, such as into the sea or mountains. Although possible, it is unlikely to receive queries at such locations, so an adversary can take advantage of this information.
On the other hand, our Privacy-Aware Remapping reduced the number of utilized cells to roughly a third of those required by Uniform Remapping. At a small cost in utility, the proposed remapping drastically reduces the number of discrete points, which has advantages at the computational level and also leads to better results under inference attacks, as we will see in the next section. Furthermore, remapping into non-admissible locations will rarely happen, since those locations have no associated weight and thus do not contribute to the weighted quality loss metric used to determine the optimal R(c), for every c.

Top-𝑵 Re-Identification Attack
The top-N re-identification attack [30] measures the risk of a privacy threat that revolves around identifying individuals based on their location data. In this attack, an individual is considered re-identifiable if their top-N visited locations are unique, even if the actual identity of the person is anonymized. This attack leverages the uniqueness of an individual's movement patterns and frequently visited places to de-anonymize them.
For each N, we compute the anonymity sets, namely the numbers of users with the same top-N preferential locations. We define a user as re-identifiable if the size of their anonymity set is 1, i.e. the user's top-N most visited locations identify the user without ambiguity, meaning that no other user has the same top-N visited locations. If a noise-based privacy mechanism, like PL_ε, is applied to the data, we add the following condition for a user to be considered re-identified: the user's top-N produced from the unaltered data must be the same as the top-N produced from the obfuscated data. This way, even if a user is in a unitary-sized anonymity set, if the obfuscation mechanism changed the top-N locations, then we consider the user non-re-identifiable, although, in this case, the user can still be considered unique.
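The attack can be sketched as follows (hypothetical helper names; ties in visit counts are broken arbitrarily by Counter, which a careful implementation would handle deterministically):

```python
from collections import Counter

def top_n(visits_per_user, n):
    """Each user's n most-visited cells, as an order-free frozenset."""
    return {u: frozenset(c for c, _ in Counter(cells).most_common(n))
            for u, cells in visits_per_user.items()}

def reidentified(truth_visits, obf_visits, n):
    """Users whose obfuscated top-n is unique across all users AND
    matches their ground-truth top-n (the extra rule described above)."""
    t, o = top_n(truth_visits, n), top_n(obf_visits, n)
    counts = Counter(o.values())
    return {u for u in o if counts[o[u]] == 1 and o[u] == t[u]}
```

A user whose anonymity set is unitary but whose obfuscated top-n differs from the ground truth is excluded, exactly as in the condition above.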
Figure 4 depicts the percentage of re-identified users when applying the top-N attack to the ground-truth data, i.e. the non-obfuscated data remapped into a grid, and when using the Uniform and Privacy-Aware Remapping. Note that the results for the non-obfuscated data do not vary with different values of the privacy parameter. Following [30], we considered the top-1, top-2 and top-3 locations of each user. Intuitively, for the ground-truth data, the top-1 is included in the top-2, as the top-2 is in the top-3, so one can only expect a greater number of re-identifications as N increases. As can be seen in the ground-truth results, at such a fine level of granularity (regions of 100 meters), an attacker can easily re-identify a large fraction of users, even when considering small sets of top locations, reaching upwards of 90% re-identification when considering the top-3 locations of each user.
Due to the extra rule we added for considering a user re-identified when a privacy mechanism is applied, re-identification no longer grows as the set of top locations enlarges: as the value of N increases, it becomes more likely that the user's top-N produced from the unaltered data differs from the top-N produced from the obfuscated data. Notice now how decreasing the privacy parameter, where one should expect privacy to be favoured, also decreases the overall re-identification. As ε decreases, higher amounts of obfuscation are added to the real location, so it becomes easier to protect a user's top locations. For example, for ε = 8 km⁻¹ and N = 1, the top-N attack with Uniform Remapping was able to re-identify 27% of users, in comparison to 83% when not applying any obfuscation to the data, which is already a substantial result. Privacy-Aware Remapping was able to decrease this value even further, to around 5%. This represents an improvement of 81% in comparison to Uniform Remapping and 94% in comparison to the ground-truth data. In many other cases, our mechanism was able to achieve total user protection, representing considerable results in comparison to Uniform Remapping.
Finally, our Privacy-Aware Remapping decreases re-identification immensely, most of the time achieving total user protection, thus showing an adequate utility/privacy trade-off: a quality loss of the same order as Uniform Remapping is accompanied by a large reduction in the number of re-identified users.

CONCLUSIONS
As Location-Based Services become ubiquitous, the need for effective privacy mechanisms cannot be overstated. State-of-the-art approaches build upon Geo-Indistinguishability, a privacy concept that prevents the precise identification of a user's location. For an extra layer of protection, discretization comes into play, transforming continuous locations into a discrete set of points. We have verified how the combination of these methods, namely Planar Laplace with a grid-based discretization technique (referred to as Uniform Remapping in this paper), greatly increases the number of utilized cells, i.e. the cells needed to fully discretize a dataset. The sparseness of locations resulting from the large number of utilized cells can lead to threats coming from non-admissible locations, thus affecting the performance of anonymization mechanisms.
In response to this challenge, we have introduced a novel approach: Privacy-Aware Remapping. This mechanism builds upon the foundation of a grid-based discretization technique and unifies cells to enhance privacy protection. While it introduces a marginal amount of extra quality loss, this loss is directly correlated with the privacy parameter chosen in PL_ε; therefore, the Privacy-Aware Remapping introduces noise of the same order as Uniform Remapping does.
One of the key strengths of our proposed mechanism lies in its significant reduction of cluster regions, referred to as the utilized cells. Our remapping function, by design, is not surjective, resulting in a smaller number of output cells compared to the total available cells. This reduction not only streamlines data storage and processing but also minimizes the risk of potential breaches.
Privacy-Aware Remapping demonstrates a robust defense against re-identification attacks, particularly those leveraging top-N locations. In most cases, it is able to achieve a level of protection tantamount to total user anonymity. This outcome reinforces the viability and effectiveness of our approach in securing the privacy of individuals utilizing location-based services.
As future work, it would be interesting to evaluate the behaviour of the proposed remapping technique against attacks other than the top-N. As the number of discrete regions tends to decrease significantly, more information about a user tends to be aggregated with that of others, and we therefore expect this mechanism to behave competently against other inference attacks.

Figure 2: Circle of obfuscation in green and algorithm's square of search in red

Figure 3: Quality loss obtained when using Planar Laplace with no Remapping and when applying Uniform and Privacy-Aware Remapping, for different ε's

Table 2: Number of utilized cells when using Ground-Truth Remapped data and when applying Uniform and Privacy-Aware Remapping, for different ε's

Number of utilized cells    ε = 4    ε = 8    ε = 16   ε = 32
Ground-Truth Remapped       41120    41120    41120    41120
Uniform Remapping           81945    74191    65367    56113
Privacy-Aware Remapping     29255    24447    21327    19539

Figure 4: Percentage of re-identified users with the Top-N Re-Identification Attack for the ground-truth data, i.e. the real locations remapped to the uniform grid, and for the Uniform and Privacy-Aware Remapping, for different privacy parameters ε and values of N

Table 1: General Notations