Towards semantic‐aware multiple‐aspect trajectory similarity measuring

The large amount of semantically rich mobility data becoming available in the era of big data has led to a need for new trajectory similarity measures. In the context of multiple‐aspect trajectories, where mobility data are enriched with several semantic dimensions, current state‐of‐the‐art approaches present some limitations concerning the relationships between attributes and their semantics. Existing works are either too strict, requiring a match on all attributes, or too flexible, considering all attributes as independent. In this article we propose MUITAS, a novel similarity measure for a new type of trajectory data with heterogeneous semantic dimensions, which takes into account the semantic relationship between attributes, thus filling the gap of the current trajectory similarity methods. We evaluate MUITAS over two real datasets of multiple‐aspect social media and GPS trajectories. With precision at recall and clustering techniques, we show that MUITAS is the most robust measure for multiple‐aspect trajectories.


PETRY ET AL.

Existing works on similarity measurement have focused on the so-called raw trajectories, basically considering the properties of space or space-time. This is the case for Longest Common SubSequence (LCSS) (Vlachos, Kollios, & Gunopulos, 2002), Edit Distance on Real sequence (EDR) (Chen, Özsu, & Oria, 2005), and Uncertain Movement Similarity (UMS) (Furtado, Alvares, Pelekis, Theodoridis, & Bogorny, 2018). These measures are very effective for answering questions about the physical movement of objects, such as which trajectories follow similar routes, or which trajectories visit a similar sequence of places.
With the explosion of big data generated on the Internet in the form of weather information, social network interactions (e.g., Facebook, Foursquare, Twitter), and geolocations (e.g., OpenStreetMap), mobility data can be enriched with several layers of semantic information. Examples are places visited or Points of Interest (POIs) (Alvares et al., 2007), the means of transportation and the goal of the trip (Bogorny, Renso, Aquino, Lucca Siqueira, & Alvares, 2014), the weather conditions while traveling, the mood of the person, and his/her posts on social media.
This new type of enriched trajectory is what we call multiple-aspect trajectory (Ferrero, Alvares, & Bogorny, 2016).
An aspect is a point of view from which an enriched trajectory may be observed (Noël, Villanova-Oliver, Gensel, & Quéau, 2015). The great challenge that we want to address in this article is how to compute the similarity of such multiple-aspect trajectories, considering several aspects together. So the question that we want to answer in this article is how similar two multiple-aspect trajectories are. How can we compare two multiple-aspect trajectories with potentially many aspects and where each of them has a number of heterogeneous attributes?
A multiple-aspect trajectory is not a simple semantic trajectory represented as a sequence of stops and moves (Spaccapietra et al., 2008), since the aspects need a more complex representation. In Figure 1 we show an example of a multiple-aspect trajectory of a tourist visiting Paris that is enriched with five aspects: visited places, weather conditions, transportation mode, social media posts, and health. Each aspect is described by its own attributes. For instance, the visited places have a spatial position, a category, a rating (the stars in the figure), and a price (the dollar symbols); the weather condition has a spatial position, a description (e.g., sunny, cloudy), and a temperature; and the aspect health has the heart rate. Among these aspects and their attributes, we observe that the attributes rating and price are related to the POI, since they specifically refer to the POI category. The attributes temperature and description refer to the weather condition, and not to the POI. Similarly, the heart rate of the object is related to the moving object and not to the POI nor the weather condition. Existing works on trajectory similarity fail to catch the relationships between attributes, because the semantics behind the trajectory attributes has not been considered so far.
The well-known similarity measures LCSS and EDR consider two points of a trajectory as similar when all their attributes match, independently of the aspect being considered, thus implying a strong dependency relationship among all attributes. This is a problem in multiple-aspect trajectories where the number of attributes is very large, and requiring a match in all attributes of all aspects significantly reduces the number of matchings. The Multidimensional Similarity Measure (MSM) (Furtado, Kopanaki, Alvares, & Bogorny, 2016), on the other hand, gives some degree of similarity if two points of a trajectory match in at least one attribute. However, MSM does not consider any relationships that may exist among attributes, thus considering all attributes as independent. Both assumptions that attributes are either all related or independent are too limiting for multiple-aspect trajectories.
FIGURE 1 A multiple-aspect trajectory

| Problem definition
Let us consider the example shown in Figure 2, with trajectories P, Q, and R. For the sake of simplicity we consider only three attributes in the example: the category of place visited and its rating, representing the POI aspect; and the temperature representing the weather aspect. Trajectories P and Q visit the same categories of places (hotel, cafe, and museum) and with the same rating. The main difference between P and Q is that P occurs where the temperature is low, while for trajectory Q the temperature is high. Trajectory R, on the other hand, goes to different POIs (barbershop, park, and restaurant), but their ratings are the same as for trajectories P and Q, and the temperature is always low. Now let us suppose that we want to find the trajectory that is the most similar to P. In our example, trajectory Q is the most similar to P, because both trajectories visit places of the same category (museum, cafe, and hotel) and with the same rating, only differing in the temperature. However, without considering the semantics of the attributes and their relationships, trajectories P and Q and trajectories P and R would have the same similarity score by state-of-the-art methods, because they all share two common attributes: P and Q share the POI category and rating, while P and R share the rating and temperature.
We claim that when analyzing the similarity of multiple-aspect trajectories, the semantics of the attributes and their relationships is more important than simply counting the number of attribute values that match or do not match. We believe that even though trajectory P shares two attribute values with R (rating and temperature), the trajectories are semantically different, because they visit completely different categories of places. In our example the attribute rating is associated to the place visited, so its semantics relies on the aspect POI, and its meaning is lost without the POI category, so these attributes should not be disassociated. Existing measures fail to distinguish the similarity between P and Q and between P and R because they consider all attributes as dependent or independent (POI category disassociated from rating). MSM considers all attributes as independent, and so it gives the same similarity score of 0.67 for both P and Q, and P and R, given that in both comparisons two attributes of the trajectories match. LCSS and EDR consider all attributes as dependent, requiring a match for all three attributes to consider two points as similar. Hence, the similarity score is 0 between all trajectories, because no pair of points exhibits a match in all attributes.
For multiple-aspect trajectories, the number of attributes increases significantly, so existing measures tend to give misleading trajectory similarity scores, because they cannot treat attributes of different aspects and do not allow the definition of attribute semantic relationships. A good similarity measure for multiple-aspect trajectories should be flexible enough to consider both independent and semantically related attributes.

FIGURE 2 Example of trajectories P, Q, and R
In this article we propose a new, flexible similarity measure called MUltIple-aspect TrAjectory Similarity (MUITAS) for multiple-aspect trajectories, which is robust enough to consider the semantics behind trajectory attributes, considering both dependent and independent attributes, and thus allowing the definition of attributes that have a semantic relationship. MUITAS supports the use of a different distance function for each attribute, and allows the definition of a weight that represents the degree of importance of each attribute. We evaluate the proposed measure using an information retrieval and a clustering approach on two real datasets with completely different characteristics, having different and heterogeneous attributes. We use the mean reciprocal rank (Craswell, 2009), mean average precision (Manning, Raghavan, & Schutze, 2008), and hierarchical clustering (Manning, Raghavan, & Schutze, 2008) to measure the quality of our work.

| Scope and outline
The scope of this article is limited to proposing a new similarity measure for big trajectory data that involve multiple semantic dimensions. How to integrate different sources of information in order to generate multiple-aspect trajectories is a whole new world of research, and this process is outside the scope of this article. In this article we assume that the trajectories are enriched with multiple aspects.
The remainder of the article is organized as follows. Section 2 presents related work, its limitations, and its main differences with respect to our approach. Section 3 introduces the proposed similarity measure and its properties.
Section 4 presents the experimental evaluation, validating the accuracy and improvements made by our approach.
Section 5 concludes by describing advantages and limitations of this work, in addition to potential future work.

| RELATED WORK
The similarity of sequences and time series was the primary problem discussed in the literature, long before the first researchers started analyzing actual trajectories. A well-known method for measuring the distance between time series is Dynamic Time Warping (DTW), designed by Berndt and Clifford (1994). DTW aligns two sequences in order to minimize the distance between their elements. A matrix with the distances between elements of both series is created, which is then used to find the contiguous path with the minimum total distance between the series. Given the limitation of DTW to one-dimensional data, ten Holt, Reinders, and Hendriks (2007) extended it to create Multidimensional Dynamic Time Warping (MD-DTW). MD-DTW normalizes the distance of elements for all attributes and then builds the distance matrix, whose elements are the sum of the distances in all attributes for every pair of elements in the sequences. DTW and MD-DTW tend to be sensitive to noise because all elements of the sequences being compared are taken into consideration. Both DTW and MD-DTW consider a single distance function for all dimensions and deal with numerical attributes only, so they are not applicable to multiple-aspect trajectories.

Shokoohi-Yekta, Hu, Jin, Wang, and Keogh (2017) proposed an adaptive DTW-based approach for multidimensional time series classification, namely $DTW_A$. $DTW_A$ runs both an independent and a dependent version of DTW ($DTW_I$ and $DTW_D$, respectively), and then chooses the best approach according to a scoring function and a threshold. Despite being an adaptive approach, $DTW_A$ treats all attributes as either dependent or independent, not allowing specific relationships between attributes. In addition, $DTW_A$ carries the limitations present in DTW, such as rigidity to the sequence of points, sensitivity to noise, and support for numerical attributes only.
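The alignment performed by DTW can be illustrated with a minimal sketch. The function name `dtw` and the default absolute-difference distance are illustrative assumptions, not the implementation of Berndt and Clifford (1994); the sketch only shows the cumulative distance matrix and why a single noisy element inflates the result.

```python
import math

def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A repeated element is absorbed by the warping path at no cost.
stretched = dtw([1, 2, 3], [1, 2, 2, 3])
# A single outlier must still be aligned with some element, so it dominates the distance.
noisy = dtw([1, 2, 3], [1, 2, 100, 3])
```

Because every element has to be aligned, the outlier in the second call contributes its full distance to the result, which is the noise sensitivity mentioned above.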
LCSS was introduced as a robust similarity measure for raw trajectories (Vlachos, Kollios, & Gunopulos, 2002). It is based on the notion that two sequences are considered similar if they exhibit similar behavior for a large part of their length. Differently from DTW and MD-DTW, LCSS reduces the impact of noisy data by defining distance and matching thresholds. Two points match and are assigned a similarity value of 1 if their distance lies below the matching threshold; otherwise, they do not match and have a similarity of 0. Although it works well with noisy data, LCSS has some disadvantages. It ignores possible gaps of points in trajectories, which, for certain problems, would mean giving the same similarity value for different pairs of trajectories. A gap refers to the existence of a sub-trajectory in between two similar components of two trajectories. Additionally, LCSS considers all attributes to be dependent, so two points are similar only when all their attributes match. With this limitation, the more trajectory attributes we consider in the similarity assessment, which is needed for multiple-aspect trajectories, the less similar trajectories tend to be.

Chen et al. (2005) proposed EDR, a distance measure for trajectories based on the edit distance that is widely used for measuring similarity between strings. The underlying idea in EDR is that, for two trajectories A and B, EDR(A,B) is given by the minimum number of inserts, deletes, and replacements of points needed to transform A into B. EDR assigns 0 when two points are similar and 1 otherwise. Besides reducing the effects of noise, EDR overcomes a major drawback present in LCSS: it assigns penalties according to the length of the gaps between two matched sub-trajectories, which results in more accurate similarity scores. However, EDR also computes a match for two points only if all attributes match, which may be too restrictive for analyzing multiple-aspect trajectories.
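The effect of requiring a match on all attributes can be seen in a small sketch of LCSS and EDR over points with several attributes. This is a simplified illustration, not the original algorithms: the temporal window of LCSS is omitted, matching is reduced to equality of all attributes, and the points (category, rating, temperature) use hypothetical rating values.

```python
def all_attributes_match(p, q):
    """Points match only if every attribute is equal (all attributes dependent)."""
    return p == q

def lcss(A, B, match=all_attributes_match):
    """Length of the longest common subsequence of matching points."""
    table = [[0] * (len(B) + 1) for _ in range(len(A) + 1)]
    for i in range(1, len(A) + 1):
        for j in range(1, len(B) + 1):
            if match(A[i - 1], B[j - 1]):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(A)][len(B)]

def edr(A, B, match=all_attributes_match):
    """Minimum number of inserts, deletes, and replacements turning A into B."""
    table = [[0] * (len(B) + 1) for _ in range(len(A) + 1)]
    for i in range(len(A) + 1):
        table[i][0] = i  # delete all remaining points of A
    for j in range(len(B) + 1):
        table[0][j] = j  # insert all remaining points of B
    for i in range(1, len(A) + 1):
        for j in range(1, len(B) + 1):
            replace_cost = 0 if match(A[i - 1], B[j - 1]) else 1
            table[i][j] = min(table[i - 1][j - 1] + replace_cost,
                              table[i - 1][j] + 1,
                              table[i][j - 1] + 1)
    return table[len(A)][len(B)]

# Same places and ratings, different temperature (rating values are assumed).
P = [("hotel", 4, "low"), ("cafe", 4, "low"), ("museum", 5, "low")]
Q = [("hotel", 4, "high"), ("cafe", 4, "high"), ("museum", 5, "high")]
```

Although P and Q agree on two of three attributes at every point, no pair of points matches, so LCSS finds no common subsequence and EDR needs one replacement per point.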
An important point about LCSS and EDR is that both measures were proposed when trajectory data were still limited to space and time dimensions. Therefore, it was appropriate to consider all attributes as interdependent.
However, with multiple-aspect trajectory data and many different attributes, these measures are not robust in the similarity assessment. Kang, Kim, and Li (2009) proposed the Common Visit Time Interval (CVTI), a similarity measure for trajectories in a cellular space. Instead of analyzing trajectories in a geometric space such as the Euclidean space, CVTI considers a discrete space of cells. The similarity score is then computed by considering the cells visited by trajectories and the shared time interval in the same cells. However, CVTI is limited to the space and time dimensions, neither allowing multiple attributes nor different relationships between them.
The Maximal Semantic Trajectory Pattern (MSTP) similarity (Ying, Lu, Lee, Weng, & Tseng, 2010) was one of the first measures proposed for trajectories to consider their semantic dimension. MSTP computes the similarity of two trajectories based on the longest common subsequence of visited POIs. However, MSTP takes into account neither the space and time dimensions nor the other attributes present in multiple-aspect trajectories. Furtado et al. (2016) presented MSM, a new similarity measure that overcomes several limitations of previous works, because it explicitly includes the semantic dimension in addition to space and time. MSM also defines weights for every attribute, given that an attribute might be more or less important for different problems.
Essentially, given two trajectories A and B, for every point of A, MSM looks for the best match in B. Subsequently, the weighted scores of the matches are added to compose the parity of A with B. Since the parity is not symmetric, MSM(A,B) is computed by the average of parity(A,B) and parity(B,A). Rather than considering pairs of points only if they match for all attributes, MSM treats all attributes separately, and assigns partial similarity according to the number of attributes in which the points match. This flexibility tends to increase the general similarity score. MSM disregards any relationships that might exist between aspects or attributes, making it less robust for multiple-aspect trajectories.
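The description of MSM above can be written compactly as follows. The notation is an assumption made here for illustration ($w_k$ for the weight of attribute $k$, $match_k$ for the binary match on attribute $k$, and non-empty trajectories); Furtado et al. (2016) should be consulted for the original formulation.

```latex
\begin{aligned}
score(a, b)  &= \sum_{k} w_k \cdot match_k(a, b), \\
parity(A, B) &= \sum_{a \in A} \max_{b \in B} score(a, b), \\
MSM(A, B)    &= \frac{parity(A, B) + parity(B, A)}{|A| + |B|}.
\end{aligned}
```

With three equally weighted attributes, two points that agree on any two of them score $2/3$, regardless of which two attributes they are.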
More recently, Furtado et al. (2018) proposed UMS, which is more robust than previous measures with regard to different sampling rates and the heterogeneity of raw trajectory data. Despite its robustness, UMS is limited to spatial attributes, thus focusing on spatial similarity, and is not appropriate for multiple-aspect trajectories.

To the best of our knowledge, at the time of writing there is no similarity measure in the literature for multiple-aspect trajectories. Indeed, the previously mentioned works address trajectory similarity regarding trajectory attributes in either a too restrictive or a too flexible manner. MUITAS is more flexible than existing measures because it supports full attribute relationships as in previous works, partial attribute relationships, as well as no attribute relationships. Table 1 compares the characteristics of the main approaches discussed and our similarity measure, such as robustness to noise, use of different distance functions for different attributes, and the ability to consider attribute relationships. As shown in Table 1, MUITAS aims to bring together the main characteristics of the other measures, thus supporting multiple-aspect trajectories. It is worth mentioning that only MSM and MUITAS were developed for trajectories with semantic attributes.

| MUITAS: MULTIPLE-ASPECT TRAJECTORY SIMILARITY MEASURE
In this section we introduce the fundamental concepts of our work and we define MUITAS, a similarity measure for multiple-aspect trajectories. In Section 3.2 we introduce a running example of MUITAS.

| Basic concepts and the proposed measure
We begin by defining the terms aspect and multiple-aspect trajectory. To the best of our knowledge, these definitions are new.

Definition 1. An aspect is a set $A = \{a_1, a_2, \ldots, a_l\}$ of $l$ characterizing attributes that semantically represent $A$.
An aspect is essentially any sort of information that can be annotated to a trajectory. For instance, we may define aspects such as the weather, the POI, and the means of transportation. The weather may have as attributes the description of conditions, the temperature, and the humidity; a POI could be described by the attributes type, rating, and price tier; and the means of transportation could be characterized by its type and average speed. These different aspects and their attributes are associated to the trajectory points, as stated in the following definition.
Definition 2. A multiple-aspect trajectory is a sequence of points $T = \langle p_1, p_2, \ldots, p_n \rangle$, with $p_i = (x, y, t, A)$ being the $i$th point of the trajectory at location $(x, y)$ at time-stamp $t$, described by the set $A = \{A_1, A_2, \ldots, A_r\}$ of $r$ aspects.
Definition 2 states that a multiple-aspect trajectory is annotated with any sort of information, which we call aspects. A point of a multiple-aspect trajectory can be as simple as a point of a raw trajectory ($A = \emptyset$), or a more complex element with other aspects besides space and time. In order to measure the similarity between two multiple-aspect trajectories it is necessary to quantify the distance between points. Notice that attributes may refer to different types of data, and so for each point we must quantify the distance for each attribute.

Definition 3. Let $P$ and $Q$ be two multiple-aspect trajectories $P = \langle p_1, p_2, \ldots, p_m \rangle$ and $Q = \langle q_1, q_2, \ldots, q_n \rangle$. For any two points $p \in P$ and $q \in Q$, the distance between $p$ and $q$ on an attribute $a_i$ of an aspect $A_j$ is given by the function $dist_i : p \times q \to \mathbb{Q}$. Two points $p \in P$ and $q \in Q$ will match on attribute $a_i$ if $dist_i(p, q) \le \delta_i$, where $\delta_i$ is a distance threshold for attribute $a_i$.
For each attribute a different distance function can be used, such as the Euclidean distance for a spatial attribute, a hierarchy-based distance for the category of a POI, or a simple discrete distance for the weather condition.
As a different distance function can be used for each attribute, the measure becomes feasible for a variety of applications. Moreover, as the distance functions can be different for each attribute, the thresholds are also variable.
For instance, for the spatial distance the threshold could be 100 m, while for the time distance the threshold could be 10 min. Having defined the way we measure the attribute distance, we now must define how to aggregate attributes that belong to the same aspect. For this we introduce the concept of feature.
Definition 4. A feature $f = \{a_1, a_2, \ldots, a_z\}$ is a non-empty set of attributes that describe a unit of analysis of a multiple-aspect trajectory.
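Definitions 3 and 4 can be sketched in code as per-attribute distance functions with their own thresholds, grouped into features. The attribute names, the coordinate convention (projected coordinates in meters), the time unit (minutes), and the zero threshold for the discrete distance are assumptions of this example; only the 100 m and 10 min thresholds come from the text.

```python
import math

def euclidean(a, b):
    """Spatial distance, assuming projected (x, y) coordinates in meters."""
    return math.dist(a, b)

def absolute_difference(a, b):
    """Temporal distance, assuming timestamps expressed in minutes."""
    return abs(a - b)

def discrete(a, b):
    """Discrete distance for categorical values: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1

# One distance function and one threshold per attribute (names are hypothetical).
DISTANCES = {"position": euclidean, "time": absolute_difference, "weather": discrete}
THRESHOLDS = {"position": 100.0, "time": 10.0, "weather": 0}

def attribute_match(p, q, attribute):
    """Two points match on an attribute if their distance is within its threshold."""
    distance = DISTANCES[attribute](p[attribute], q[attribute])
    return distance <= THRESHOLDS[attribute]

def feature_match(p, q, feature):
    """A feature matches only if all of its attributes match."""
    return all(attribute_match(p, q, a) for a in feature)

p = {"position": (0.0, 0.0), "time": 600, "weather": "sunny"}
q = {"position": (30.0, 40.0), "time": 608, "weather": "cloudy"}
```

Here `p` and `q` match on the feature `("position", "time")` but not on any feature that includes the weather, which is the kind of grouping a feature is meant to express.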
To avoid misunderstanding and conflict between concepts, we hereafter refer to attribute as an atomic view

An application essentially defines the context of the problem, that is, how trajectories will be analyzed.

preferences regarding the price levels of the places visited are more important (i.e., feature $f_1$), then the weights could be 1/2, 1/4, and 1/4, respectively.
Given an application, we must now define how to measure the similarity between trajectory points and between the trajectories themselves. Definition 6 presents the score function used to compute the similarity score between trajectory points.

Definition 6. Given two trajectory points $p \in P$ and $q \in Q$, and an application, the matching score between $p$ and $q$ is given by the function $score : P \times Q \to [0, 1]$, defined as:

$$score(p, q) = \sum_{k} w_k \cdot match_{f_k}(p, q)$$

where the sum ranges over all features $f_k$ of the application, $w_k$ is the weight of feature $f_k$, and

$$match_{f_k}(p, q) = \begin{cases} 1 & \text{if } dist_i(p, q) \le \delta_i \text{ for all } a_i \in f_k \\ 0 & \text{otherwise} \end{cases}$$

At this point, we have the basic definitions necessary to propose the multiple-aspect trajectory similarity measure. Furtado et al. (2016) define a parity function which is the basis of the similarity measure MSM. The parity function adds the scores of the best matches of the points of one trajectory with points of another trajectory. We use the same function in our similarity measure, given in the following definition.
Definition 7. Given the set of multiple-aspect trajectories, and two multiple-aspect trajectories $P$ and $Q$ in this set, the parity of $P$ with $Q$ is given by the function $parity$, which maps a pair of trajectories to a value in $[0, |P|]$ and is defined as follows:

$$parity(P, Q) = \sum_{p \in P} \max \{ score(p, q) : q \in Q \}$$

It is worth highlighting that, differently from existing similarity measures, MUITAS allows the definition of relationships between attributes for assessing trajectory similarity. The similarity of two multiple-aspect trajectories $P$ and $Q$, computed by MUITAS, is given by the average parity of $P$ and $Q$, which is given in the following definition.
Definition 8. MUITAS. Given the set of multiple-aspect trajectories, and two multiple-aspect trajectories $P$ and $Q$ in this set, the similarity score of $P$ and $Q$ is calculated by the function $MUITAS$, which maps a pair of trajectories to a value in $[0, 1]$ and is defined as:

$$MUITAS(P, Q) = \frac{parity(P, Q) + parity(Q, P)}{|P| + |Q|}$$

Similarly to MSM, MUITAS has the properties of non-negativity (Lemma 1), relaxed identity of indiscernibles (Lemma 2), and symmetry (Lemma 3).
Proof. By Definition 6, if $dist_i(p, q) \le \delta_i$ for all attributes $a_i$ in a feature $f_k$, then $match_{f_k}(p, q) = 1$. Hence, $score(p, q) = 1$, because $match_{f_k}(p, q) = 1$ for all features $f_k$ of the application. Therefore, $parity(P, Q) = |P|$, because for any $p \in P$ there is a $q \in Q$ where $score(p, q) = 1$. Similarly, $parity(Q, P) = |Q|$. By Definition 8, $MUITAS(P, Q) = \frac{|P| + |Q|}{|P| + |Q|} = 1$. If $P = Q$, then by Definition 8 $MUITAS(P, Q) = 1$. On the other hand, if for one attribute $a_i$ and at least one point $p \in P$ there is no $q \in Q$ such that $dist_i(p, q) \le \delta_i$, then $score(p, q) < 1$, $parity(P, Q) < |P|$ and, therefore, $MUITAS(P, Q) < 1$.

Proof. Direct from Definition 8.
To better understand the proposed measure and how it differs from state-of-the-art alternatives, we now compare the similarity scores for the introductory example in Section 1.

| Running example
In this section we present a running example using trajectories P, Q, and R introduced in Section 1 (see Figure 2), for which existing measures give undesirable results. As previously mentioned, we want to find the trajectory most similar to P. We instantiate an application for which Table 2 describes the set of features, the weights, the attributes, the distance functions, and the thresholds $\Delta$. For the sake of simplicity, all distance functions are binary (i.e., any two attribute values match only if they are equal). Also, we defined the feature weights according to the number of attributes the features contain, in order to make a fair comparison with MSM.
Let us compute the similarity of P and Q. The first step is to compute the similarity scores between all points of both trajectories. Starting from $p_1$ and $q_1$, there is a match on the feature $f_1$ because the category and rating attributes of the points are equal, so the score for $p_1$ and $q_1$ is 2/3. For the points $p_1$ and $q_2$, the only attribute they have in common is the rating. Therefore, the score for $p_1$ and $q_2$ is 0, because the rating is in the feature $f_1$, and so the categories should also be equal for a match on feature $f_1$ to occur. Finally, $p_1$ and $q_3$ have no attributes in common, so there is no match between the points and their score is also 0. The same reasoning applies to the remaining points: since P and Q visit the same categories of places with the same ratings and differ only in the temperature, each point finds its best match with a score of 2/3. We thus have $parity(P,Q) = parity(Q,P) = 3 \times 2/3 = 2$, and so $MUITAS(P,Q) = 2/3$.

To compute the similarity of trajectories P and R we also need to compute the scores between their points, as shown in Table 4. The score for $p_1$ and $r_1$ is 1/3, because the only feature in which they entirely match is $f_2$ with the temperature attribute. Although they have the same rating, their categories are different. The same occurs for all point comparisons, in which the points only match on feature $f_2$. We have $parity(P,R) = parity(R,P) = 3 \times 1/3 = 1$, and so $MUITAS(P,R) = 1/3$.

Table 5 shows the similarity scores given by other measures for P and Q, and P and R. MUITAS is the only measure that is able to distinguish between Q and R, assigning a lower similarity score for P and R, thus retrieving only Q as the most similar to P.
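The running example can be reproduced with a short sketch of the score, parity, and MUITAS functions, assuming binary distances and non-empty trajectories. The function names, the data layout, and the concrete rating values are assumptions made for illustration (the text only states which attributes coincide); the sketch is not the authors' Java implementation.

```python
from fractions import Fraction

def score(p, q, features):
    """Sum the weights of the features whose attributes all match (binary distance)."""
    return sum(w for attrs, w in features if all(p[a] == q[a] for a in attrs))

def parity(P, Q, features):
    """Add, for each point of P, the score of its best match in Q."""
    return sum(max(score(p, q, features) for q in Q) for p in P)

def muitas(P, Q, features):
    """Average parity of two non-empty trajectories."""
    total = parity(P, Q, features) + parity(Q, P, features)
    return Fraction(total, len(P) + len(Q))

def point(category, rating, temperature):
    return {"category": category, "rating": rating, "temperature": temperature}

# Trajectories of Figure 2; the rating values 4 and 5 are hypothetical.
P = [point("hotel", 4, "low"), point("cafe", 4, "low"), point("museum", 5, "low")]
Q = [point("hotel", 4, "high"), point("cafe", 4, "high"), point("museum", 5, "high")]
R = [point("barbershop", 4, "low"), point("park", 4, "low"), point("restaurant", 5, "low")]

# f1 groups category and rating (POI aspect); f2 holds the temperature (weather aspect).
RELATED = [(("category", "rating"), Fraction(2, 3)), (("temperature",), Fraction(1, 3))]
# Every attribute as its own feature, which mimics the independence assumed by MSM.
INDEPENDENT = [((a,), Fraction(1, 3)) for a in ("category", "rating", "temperature")]
# A single feature with all attributes, which mimics the strict matching of LCSS and EDR.
DEPENDENT = [(("category", "rating", "temperature"), Fraction(1))]
```

Evaluating `muitas(P, Q, RELATED)` and `muitas(P, R, RELATED)` separates Q from R, whereas the `INDEPENDENT` and `DEPENDENT` configurations give the same value for both pairs, so the three cases differ only in how attributes are grouped into features.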
In the next section we present an experimental evaluation on real-world datasets.

| EXPERIMENTAL EVALUATION
In this section we evaluate the accuracy of the proposed similarity measure using two real trajectory datasets with different characteristics, to show the robustness of MUITAS in different application domains: a dataset of Foursquare check-ins in the city of New York collected between April 2012 and February 2013 (Yang, Zhang, Zheng, & Yu, 2015); and a dataset of semantic trajectories collected in Pisa, Italy, between May 20, 2014 and September 30, 2014. We evaluate the Mean Average Precision (MAP), the Mean Reciprocal Rank (MRR), and we perform Hierarchical Clustering Analysis (HCA), similarly to the evaluations reported in Chen et al. (2005) and Furtado et al. (2018). The similarity measures were implemented in Java, and the experiments were conducted on a PC running Linux Ubuntu 18.04 LTS, equipped with an Intel Core i7-3630QM CPU @ 2.4 GHz × 8 and 6 GB RAM. Next, we describe the datasets (Section 4.1), the ground truth definition (Section 4.2), the experimental setup (Section 4.3), and the results (Section 4.4).

| Datasets
The Foursquare dataset contains 227,428 check-ins of 1,083 different users, and each check-in is composed of a time-stamp and the corresponding Foursquare venue ID. We then collected venue information, including the spatial position, rating, and price tier, from the Foursquare API (https://developer.foursquare.com/). Subsequently, historical weather data were collected via the Weather Underground API (https://www.wunderground.com/weather/api/) and combined with each Foursquare check-in. Table 6 describes the attributes and the distance function used for each attribute. We enrich trajectories with aspects that may influence the movement behavior of the moving object. For instance, we consider the weather because a user may visit a park on sunny days, but might prefer going to a museum if it is raining. This is to show that our measure supports multiple aspects.
The Pisa dataset was collected by 157 volunteers in Pisa, via an app installed on the user's mobile phone. The trajectories used in this experiment are composed of movement segments that represent the user's daily routine.
Each segment was annotated with the means of transportation, the purpose of the trip, the weather conditions, the distance traveled and time duration. In total the dataset has 10,880 segments, each described by the attributes shown in Table 7. Both the Foursquare and Pisa datasets are important for our evaluation because they contain multiple-aspect information.
Having described the datasets, we next describe the ground truth defined for evaluating the similarity measure.

| Ground truth definition
The check-ins of the Foursquare dataset and the segments of the Pisa dataset are not labeled with a class and, for that reason, we use a similar approach to the trajectory-user linking problem introduced by Gao et al. (2017) to evaluate our method. We applied a few transformations to the datasets in order to ensure variability and consistency, as described below.
We first removed 26 check-ins with missing information about their category on Foursquare. Next, we removed 21,332 noisy check-ins that belong to broad categories such as roads, rivers, and neighborhoods, because the geographic location is unique for each venue. Subsequently, we removed 1,230 check-ins that were duplicated, considering a 10-min threshold.
We then created weekly trajectories of check-ins for each user, given the whole set of check-ins. Considering that human behavior has high spatio-temporal regularity (González, Hidalgo, & Barabási, 2008), the weekly trajectory of a user is probably more similar to trajectories of the same user and less similar to those of other users.
Hence, we labeled each weekly trajectory with the corresponding user, which defines our ground truth. We filtered the weekly trajectories in order to ensure variability in the evaluation: we removed short trajectories with fewer than 10 check-ins and removed all trajectories of users with fewer than 10 trajectories.

For the Pisa dataset we created daily trajectories, because, differently from the Foursquare dataset, the trajectory points are less sparse and represent the detailed user movement and daily routine. In order to ensure variability, we removed small trajectories with fewer than three segments and then removed users with fewer than five trajectories. The final dataset contains a total of 8,800 segments in 1,535 daily trajectories of 67 different users. The trajectories have an average length of ~6 segments and an average of ~23 trajectories per user. In the next section we detail the metrics used to evaluate the results.
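The construction of the ground truth described above (grouping check-ins into per-user weekly trajectories, then filtering short trajectories and users with few trajectories) could look like the following sketch. The tuple layout `(user, timestamp, venue)`, the function names, and the use of ISO calendar weeks to delimit a week are assumptions; the text does not state how weeks were delimited.

```python
from collections import Counter, defaultdict
from datetime import datetime

def weekly_trajectories(checkins):
    """Group (user, timestamp, venue) check-ins into per-user weekly trajectories."""
    groups = defaultdict(list)
    for user, timestamp, venue in sorted(checkins, key=lambda c: (c[0], c[1])):
        year, week, _ = timestamp.isocalendar()  # ISO weeks: an assumption of this sketch
        groups[(user, year, week)].append((timestamp, venue))
    return dict(groups)

def filter_trajectories(groups, min_points=10, min_trajectories=10):
    """Drop short trajectories, then drop users left with too few trajectories."""
    long_enough = {k: v for k, v in groups.items() if len(v) >= min_points}
    per_user = Counter(user for user, _, _ in long_enough)
    return {k: v for k, v in long_enough.items() if per_user[k[0]] >= min_trajectories}

# Toy data (hypothetical): u1 has two check-ins in one week and one in the next.
checkins = [
    ("u1", datetime(2012, 4, 2, 9, 0), "cafe"),
    ("u1", datetime(2012, 4, 3, 18, 30), "museum"),
    ("u1", datetime(2012, 4, 10, 12, 0), "park"),
    ("u2", datetime(2012, 4, 2, 8, 0), "hotel"),
]
groups = weekly_trajectories(checkins)
kept = filter_trajectories(groups, min_points=2, min_trajectories=1)
```

The default arguments follow the Foursquare thresholds in the text; the Pisa filtering would correspond to daily grouping with `min_points=3` and `min_trajectories=5`.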

| Experimental setup
Similarity measures are commonly used in clustering analysis, recommendation, and information retrieval systems. We evaluate the proposed method with three different analyses also performed in previous works (Chen et al., 2005; Esuli, Petry, Renso, & Bogorny, 2018; Furtado et al., 2018). We measure the MAP and the MRR in an information retrieval task, and we perform HCA using EDR, LCSS, MSM, and MUITAS to compute the similarity between trajectories. We do not compare our work to MD-DTW, UMS, and $DTW_A$, because multiple-aspect trajectories have categorical attributes, and these measures were designed for numerical attributes of time series and trajectories. MAP and MRR are rank-based measures commonly used for evaluating information retrieval systems.

We run complete-linkage hierarchical clustering and evaluate the generated clusters using the F-score, as described in Manning et al. (2008). The F-score weighs individual cluster quality and the number of clusters generated. For instance, as the number of classes within a cluster increases, precision decreases and the score is penalized. Similarly, as the number of clusters exceeds the number of classes, recall decreases and the F-score falls. In other words, the F-score is equal to 1 if and only if all clusters are pure and the number of clusters is equal to the number of classes.
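The two rank-based measures can be sketched as follows. The evaluation protocol is an assumption of this example: each trajectory is used in turn as a query against all the others, and a retrieved trajectory counts as relevant when it carries the same user label as the query. The function names are illustrative.

```python
def reciprocal_rank(ranked_labels, query_label):
    """1 / rank of the first relevant item, or 0 if there is none."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            return 1.0 / rank
    return 0.0

def average_precision(ranked_labels, query_label):
    """Mean of the precision values measured at each relevant item."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def evaluate(similarity, trajectories, labels):
    """Return (MRR, MAP), querying with each trajectory against all others."""
    rr, ap = [], []
    for i, query in enumerate(trajectories):
        others = [j for j in range(len(trajectories)) if j != i]
        # Rank the remaining trajectories from most to least similar.
        others.sort(key=lambda j: similarity(query, trajectories[j]), reverse=True)
        ranked = [labels[j] for j in others]
        rr.append(reciprocal_rank(ranked, labels[i]))
        ap.append(average_precision(ranked, labels[i]))
    return sum(rr) / len(rr), sum(ap) / len(ap)

# Toy usage with numbers standing in for trajectories (hypothetical data).
toy_similarity = lambda x, y: -abs(x - y)
mrr, mean_ap = evaluate(toy_similarity, [1, 2, 10, 11], ["a", "a", "b", "b"])
```

Any similarity function with the signature `similarity(P, Q)` can be plugged into `evaluate`, so the same ranking procedure applies to EDR, LCSS, MSM, and MUITAS.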
We instantiate an application S = (A, D, Δ, F, W), where the attributes A and distance functions D are presented in Tables 6 and 7 for each dataset. Tables 8 and 9 show the thresholds Δ tested for each attribute in the datasets.
Only attributes with a threshold value greater than 0 are displayed.

for most users. MSM had better results than EDR and LCSS, especially for HCA. This supports our claim that some attributes may be independent, since MSM does not consider any relationships between attributes.

| Results and discussion on Pisa and Foursquare datasets
MUITAS achieved the best averages regardless of dataset, thresholds, and evaluation technique, since it allows partial relationships to be defined, considering both dependent and independent attributes. MUITAS is neither too strict (like LCSS and EDR) nor too flexible (like MSM).
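The difference between strict, flexible, and partial attribute relationships can be made concrete with a simplified point-matching sketch. It is not the full MUITAS definition: we assume that a feature is a group of attributes that matches only when every attribute in the group is within its threshold, and that the score of two points is the weighted sum of matching features. The attribute names, thresholds, weights, and helper functions below are all hypothetical.

```python
def match_score(p, q, distances, thresholds, features, weights):
    """Weighted share of features on which points p and q match.

    A feature is a tuple of attribute names; it matches only if every one of
    its attributes is within its threshold (a simplifying assumption).
    """
    score = 0.0
    for feature, weight in zip(features, weights):
        if all(distances[a](p[a], q[a]) <= thresholds[a] for a in feature):
            score += weight
    return score


def mismatch(a, b):
    """Distance for categorical values: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


def abs_diff(a, b):
    """Distance for numerical values."""
    return abs(a - b)


# Hypothetical attributes, distance functions, and thresholds.
distances = {"poi": mismatch, "price": abs_diff, "weather": mismatch}
thresholds = {"poi": 0.0, "price": 1.0, "weather": 0.0}
p = {"poi": "restaurant", "price": 2, "weather": "rain"}
q = {"poi": "restaurant", "price": 4, "weather": "rain"}

# All attributes dependent: a single feature, so one mismatch rejects the pair.
strict = match_score(p, q, distances, thresholds,
                     [("poi", "price", "weather")], [1.0])
# All attributes independent: one feature per attribute.
flexible = match_score(p, q, distances, thresholds,
                       [("poi",), ("price",), ("weather",)], [1 / 3, 1 / 3, 1 / 3])
# Partial dependence: POI and price are tied together, weather is independent.
partial = match_score(p, q, distances, thresholds,
                      [("poi", "price"), ("weather",)], [0.5, 0.5])
```

With these toy points the price differs by more than its threshold, so the strict configuration scores 0, the flexible one about 0.67, and the partial one 0.5. The partial configuration rejects the POI match because its dependent price attribute fails, while still crediting the independent weather attribute.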
To evaluate the significance of the results, we also performed a statistical analysis using the analysis of variance test (Montgomery, 2017) with significance level α = 0.05, which resulted in a p-value less than 0.05, so the differences are statistically significant. One important remark on these experiments is that the best results for both datasets were obtained with MSM and MUITAS, the measures specifically developed for semantic trajectories, which according to Table 1 do not consider the element sequence. This suggests that, for the Foursquare and Pisa datasets, the sequence of user behavior in trajectories may not be relevant for discriminating between users.
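For readers unfamiliar with the analysis of variance test, the F statistic behind it can be sketched in a few lines. The grouping is an assumption of this example (one group of scores per similarity measure), and the numbers are toy values, not results from the experiments.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of samples.

    Each element of `groups` is a list of scores. Treating each similarity
    measure as one group is an assumption of this sketch.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation explained by the group means versus variation inside groups.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy scores for three hypothetical groups.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

The p-value is then obtained from the F distribution with `df_between` and `df_within` degrees of freedom and compared against α. In practice a library routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value.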
In summary, the results in these datasets show that using MUITAS to measure similarity gives more precise results (in terms of MRR and MAP measures) and more concise clusters (HCA), in comparison to state-of-the-art trajectory similarity measures.
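The two rank-based measures referred to above can be sketched as follows. We assume that each trajectory acts as a query, that the remaining trajectories are ranked by decreasing similarity, and that a ranked trajectory is relevant when it belongs to the same user as the query. The function names are hypothetical, and each ranking is assumed to contain all relevant items.

```python
def average_precision(ranked_relevance):
    """Average precision of one ranked list.

    `ranked_relevance[k]` is True if the item at rank k + 1 is relevant.
    The list is assumed to contain every relevant item.
    """
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def reciprocal_rank(ranked_relevance):
    """Inverse of the rank of the first relevant item, or 0 if there is none."""
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


def mean_over_queries(metric, rankings):
    """Average a per-query metric over all queries (gives MAP or MRR)."""
    return sum(metric(r) for r in rankings) / len(rankings)


# Two toy queries: relevance flags of the ranked results.
rankings = [[True, False, True], [False, True, False]]
map_score = mean_over_queries(average_precision, rankings)
mrr_score = mean_over_queries(reciprocal_rank, rankings)
```

MRR rewards only the position of the first relevant result, whereas MAP also takes into account where the remaining relevant results appear in the ranking.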

| Conclusions and future work
The bold values indicate that our method (MUITAS) outperformed previous works in all three experiments.

The enrichment of movement data with different contexts and several data attributes has led to a new type of trajectory, which we call multiple-aspect trajectory. We claim that, for a better understanding of movement patterns of human mobility, these attributes should be considered in the similarity assessment. However, these heterogeneous data attributes, including space, time, and several layers of semantics, make the trajectory similarity problem more complex than it is for traditional spatio-temporal data. To the best of our knowledge, there are no similarity measures in the literature that consider multiple-aspect trajectories. In this article we propose MUITAS, a similarity measure that supports both independent and dependent attributes, a different distance function for each attribute, and a weight that represents the importance of each attribute. The state-of-the-art methods consider the attributes to be either all independent or all dependent. MUITAS overcomes this limitation by allowing partial attribute dependence. Indeed, a distinctive characteristic of MUITAS is that the features of an aspect are defined as part of the application definition, and these features drive the similarity measurement. It is important to point out that MUITAS does not depend on a specific application domain and can be easily applied to different scenarios.
To evaluate the relevance and effectiveness of MUITAS, we performed a robust experimental evaluation on two real-world datasets with complementary characteristics, using three different evaluation techniques.
The results showed that MUITAS is more accurate than existing trajectory similarity measures on these datasets.
Even though we focused on multiple-aspect trajectories, the proposed similarity measure can be applied to any type of trajectory or sequenced data in a variety of applications. In future work we will analyze the similarity of heterogeneous points of trajectories, and propose an extension of MUITAS to support trajectories that have points with different aspects and/or different attributes.

Bollobás, Das, Gunopulos, and Mannila (1997), we only consider the most recent approach proposed by Vlachos et al. (2002) for trajectory data, since it is more robust than the former.