Using Fuzzy Ontology to Improve Similarity Assessment: Method and Evaluation

Assessing semantic similarity is a fundamental requirement for many AI applications. Crisp ontology (CO) is one of the knowledge representation tools that can be used for this purpose. Thanks to the development of the semantic web, CO-based similarity assessment has become a popular approach in recent years. However, in the presence of vague information, CO cannot account for the uncertainty of relations between concepts. Fuzzy ontology (FO), on the other hand, can effectively process the uncertainty of concepts and their relations. This paper proposes an approach for assessing concept similarity based on FO. The proposed approach combines fuzzy relation composition with an edge-counting approach to assess similarity. Accordingly, the proposed measure relies on the taxonomical features of an ontology in combination with statistical features of concepts. Furthermore, an evaluation approach for FO-based similarity measures, named FOSE, is proposed. The proposed similarity measure is evaluated with FOSE on social network data. The evaluation results indicate that the proposed approach outperforms its respective CO-based measure.


INTRODUCTION
Similarity reasoning is the identification of syntactically different concepts that are semantically close. Assessing concept similarity is growing in importance within ontology engineering and, in particular, ontology merging and ontology alignment. 1 An ontology is a knowledge representation mechanism that is understandable by intelligent agents. It consists of a hierarchical description of concepts in a particular domain connected by taxonomic and non-taxonomic relations. 2 It is employed in reasoning about domain concepts. Ontologies are the fundamental infrastructure of the semantic web. 3 With the rapid development of the semantic web, the number of ontologies is likely to increase greatly during the next few years, which raises the demand for rapid and accurate assessment of concept similarity. 4,5 In this context, assessing concept similarity becomes more important in the presence of vague information. When some relations between domain concepts are vague, or when there is uncertainty in defining a concept, 6 these problems can be tackled with fuzzy information. Fuzzy logic theory was proposed by Zadeh 7 and later applied successfully in various research areas. 8-12 Fuzzy logic, as a powerful infrastructure for uncertainty management, was coupled with ontology to originate fuzzy ontology (FO). 13 FO is a generalization of crisp ontology (CO) in which fuzzy relations exist between crisp concepts. FOs have been successfully implemented in several application areas such as news summarization, 14 diet recommendation, 15,16 flight booking, 17 information retrieval, 18-25 reputation management, 26,27 collision avoidance, 28 and knowledge mobilization. 29,30 However, the literature on FO-based assessment of similarity is limited to the use of formal concept analysis (FCA). FCA is concerned with the formalization of concepts and conceptual thinking. 31 A key limitation of the FCA-based approach, however, is that it necessitates a particular type of world modeling, that is, concept-to-attributes, which may not be applicable in all situations. In addition, considering the need for human intervention in creating the concept-to-attributes database, the FCA-based approach is semi-automatic rather than fully automatic and of exponential time order, $O(2^N)$. 32
This article presents a measure for assessing concept similarity based on FO. Compared with other approaches in the literature, the key features of the proposed approach are its independence from FCA and its lower prerequisite complexity. As a case study, this measure is then used to generate a similarity matrix of concept pairs in the context of social networks (SNs).
Furthermore, to evaluate the proposed semantic similarity measure, a new evaluation approach is proposed. The approach, named FO-based similarity evaluation (FOSE), is, to the best of our knowledge, the first data-driven evaluation approach for FO-based semantic similarity. FOSE is then applied to our case study to evaluate the proposed FO-based similarity measure.
The rest of this paper is organized as follows: Section 2 defines the notation and terminology used in the rest of this paper. Section 3 is devoted to the proposed FO-based approach for assessing concept semantic similarity. In Section 4, the proposed approach for evaluating a FO-based similarity measure is introduced. A case study deploying the proposed measure in the context of SNs is covered in Section 5. Finally, a comparison with the literature, concluding remarks, and future work are discussed in Section 6.

NOTATIONS AND PRELIMINARY DEFINITION
In this section, we formally define some terms and describe the notation used in this paper.
In formal terms, a fuzzy set can be defined as follows:
DEFINITION 1. A fuzzy set S over the universe of discourse X is defined by its membership function $\mu_S$, which maps each element of X to a value in the interval [0, 1]: 17
$$\mu_S : X \rightarrow [0, 1]$$
where S is the fuzzy set and $\mu_S$ is the membership function. $\mu_S(x)$ represents the degree to which x belongs to S. If X is continuous, then S can be rewritten as follows:
$$S = \int_X \mu_S(x) / x$$
Additionally, S can be organized into a set of ordered pairs as follows:
$$S = \{ (x, \mu_S(x)) \mid x \in X \}$$
Typically, an ontology is illustrated as a directed acyclic graph (DAG) or a hierarchy, in which nodes correspond to concepts and edges represent relationships between pairs of concepts. In some ontologies, there is only one relationship between nodes, whereas in the more general case, there exists more than one relationship between nodes. 34 The most common type of ontology relation is the taxonomical "is-a" relation, which indicates the similarity of concept pairs.
In a hierarchy corresponding to an ontology, one node is specified as the root, which is the starting node. A path is a sequence of nodes that are adjacent via the edges of the hierarchy. Each node at an intermediate level is associated with a parent, one or more child nodes, and one or more sibling nodes. 35 The parent of a node is the node one level higher in the hierarchy. Inversely, if a node is a parent of another node, the latter is called a child of the parent. A parent may have several child nodes and, in a DAG, a node may have several parent nodes. Sibling nodes share the same parent. The depth of a node is the length of the path from the node to the root. Let us use the operators parent(c), sibling(c), and depth(c) to denote the parent, sibling set, and depth of a node c in a hierarchy, respectively. A higher-level node from which c can be reached through a chain of parent-child relations is an ancestor of c, denoted by ancestor(c). Any two nodes in an ontology share a set of common ancestor nodes, and the one with the highest depth is typically referred to as the lowest common ancestor (LCA) of the two nodes.
Discarding the direction of the edges in an ontology, there exists at least one path between every pair of nodes. 34 Among all possible paths between two crisp concepts $c_1$ and $c_2$ in an ontology, the one passing through their LCA is the shortest path (sp) between the two concepts, that is,
$$\mathrm{sp}(c_1, c_2) = |\mathrm{path}(c_1, \mathrm{LCA}(c_1, c_2))| + |\mathrm{path}(\mathrm{LCA}(c_1, c_2), c_2)|$$
where $|\mathrm{path}(c_i, c_j)|$ counts the number of edges (relations) in the path from $c_i$ to $c_j$.
DEFINITION 5. A FO is defined by means of fuzzy relations characterized by a membership function. Considering Equation 5, this type of ontology is formulated as follows:
$$FO = (C, R_f, D)$$
where $R_f$ is a (binary) fuzzy relation over two countable crisp sets of concepts. 36 An example is shown in Figure 1, where crisp concepts are connected to each other by fuzzy "is-a" relations and a fuzzy degree of membership, $\mu(x)$ in the [0, 1] interval, is assigned to each relation.
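The hierarchy operators and the shortest-path definition above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a tree-shaped hierarchy in which every node has a single parent (a general DAG would need a set of parents per node), and the node names in `PARENT` are hypothetical.

```python
# Hypothetical toy hierarchy: child -> parent; the root maps to None.
PARENT = {
    "root": None,
    "a": "root",
    "b": "root",
    "a1": "a",
    "a2": "a",
    "b1": "b",
}

def ancestors(c, parent):
    """Return [c, parent(c), ..., root]: c followed by all of its ancestors."""
    chain = []
    while c is not None:
        chain.append(c)
        c = parent[c]
    return chain

def depth(c, parent):
    """Number of edges on the path from c up to the root."""
    return len(ancestors(c, parent)) - 1

def lca(c1, c2, parent):
    """Deepest node that lies on the root paths of both c1 and c2."""
    seen = set(ancestors(c1, parent))
    for node in ancestors(c2, parent):  # walks upward, so the first hit is the deepest
        if node in seen:
            return node
    raise ValueError("nodes do not share a root")

def sp(c1, c2, parent):
    """sp(c1, c2) = |path(c1, LCA)| + |path(LCA, c2)|, counted in edges."""
    a = lca(c1, c2, parent)
    return (depth(c1, parent) - depth(a, parent)) + (depth(c2, parent) - depth(a, parent))
```

For the toy hierarchy, `sp("a1", "a2", PARENT)` counts the two edges through the common parent `"a"`, while `sp("a1", "b1", PARENT)` has to pass through the root.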
Let $R_1(x, y)$, $(x, y) \in X \times Y$, and $R_2(y, z)$, $(y, z) \in Y \times Z$, be two fuzzy relations. The max-product composition of $R_1$ and $R_2$ is denoted as $R_1 \circ R_2$ and defined as follows: 37
$$\mu_{R_1 \circ R_2}(x, z) = \max_{y \in Y} \left[ \mu_{R_1}(x, y) \cdot \mu_{R_2}(y, z) \right] \quad (8)$$
A dataset, $X = \{x_1, x_2, \ldots, x_N\}$, is a set of N objects or data points represented as feature vectors in an F-space. A distance matrix of a dataset, dist(X), is a matrix giving the distance between each pair of elements of X. Given an ontology O = (C, R, D) of N concepts, a distance matrix corresponds to C and determines the distance of concept pairs. Semantic similarity (distance) computes the similarity between concepts that need not be lexically similar. Semantic distance can be inferred from web data: statistical analysis of web files for a set of concepts C results in a distance matrix denoted as web_dist(C). Another approach for the assessment of concept semantic distance is based on an ontology.
DEFINITION 6. Given a CO of N concepts, CO = (C, R, D), the semantic distance of concept pairs can be calculated based on the ontology relations, R. The result is an N × N distance matrix of concept pairs, denoted as CO_dist. If the source ontology is a FO, the distance measure is a FO-based distance measure, denoted as FO_dist.
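As an illustration of the max-product composition introduced in this section, the following minimal sketch composes two small fuzzy relations. The dictionary representation and the membership values are assumptions chosen for the example, not data from this study.

```python
def max_product(r1, r2):
    """Max-product composition of two fuzzy relations.

    Each relation is a dict {(x, y): membership degree in [0, 1]}.
    The result maps (x, z) to the maximum over y of r1[(x, y)] * r2[(y, z)].
    """
    out = {}
    for (x, y1), m1 in r1.items():
        for (y2, z), m2 in r2.items():
            if y1 == y2:
                out[(x, z)] = max(out.get((x, z), 0.0), m1 * m2)
    return out

# Hypothetical relations: x1 reaches z1 through y1 or through y2.
R1 = {("x1", "y1"): 0.8, ("x1", "y2"): 0.5}
R2 = {("y1", "z1"): 0.5, ("y2", "z1"): 1.0}

composed = max_product(R1, R2)
```

Here the path through `y1` contributes 0.8 × 0.5 and the path through `y2` contributes 0.5 × 1.0, and the composition keeps the larger of the two products.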
Given two distance matrices of objects (concepts), their correlation and variation can be evaluated by the Pearson correlation, the relative root squared error (RRSE), and the relative absolute error (RAE) as follows:
$$\mathrm{Pearson}(A, B) = \frac{\sum_i (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_i (a_i - \bar{a})^2} \, \sqrt{\sum_i (b_i - \bar{b})^2}} \quad (9)$$
$$\mathrm{RRSE}(A, B) = \sqrt{\frac{\sum_i (b_i - a_i)^2}{\sum_i (a_i - \bar{a})^2}} \quad (10)$$
$$\mathrm{RAE}(A, B) = \frac{\sum_i |b_i - a_i|}{\sum_i |a_i - \bar{a}|} \quad (11)$$
The Pearson correlation measures the strength of the relation between two variables A and B. Here $a_i$ and $b_i$ denote the individual values of variables A and B, respectively, and $\bar{a}$ and $\bar{b}$ denote their average values. A higher Pearson correlation indicates a stronger relation between the two variables. The two other criteria, RAE and RRSE, measure the relative variation between a target variable, A, and its predicted value, B. RRSE calculates the root of the squared error between the target variable A and its predicted value B, whereas RAE calculates the absolute value of this variation. Smaller RAE and RRSE values indicate a better prediction. Here $|x|$ refers to the absolute value of x.
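The three criteria can be sketched in plain Python as follows. The sketch assumes that the two distance matrices have already been flattened into equally long lists of paired entries, and it uses the standard textbook forms of the Pearson correlation, RRSE, and RAE; the sample values are hypothetical.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def pearson(a, b):
    """Pearson correlation between paired values a and b."""
    ma, mb = _mean(a), _mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a)) * math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / den

def rrse(target, predicted):
    """Relative root squared error of `predicted` with respect to `target`."""
    mt = _mean(target)
    num = sum((p - t) ** 2 for t, p in zip(target, predicted))
    den = sum((t - mt) ** 2 for t in target)
    return math.sqrt(num / den)

def rae(target, predicted):
    """Relative absolute error of `predicted` with respect to `target`."""
    mt = _mean(target)
    num = sum(abs(p - t) for t, p in zip(target, predicted))
    den = sum(abs(t - mt) for t in target)
    return num / den

# Hypothetical flattened distance values.
target = [0.0, 0.5, 1.0]
predicted = [0.1, 0.5, 0.9]
```

Both error measures are normalized by the spread of the target values around their mean, so a value below 1 means the prediction does better than simply predicting the mean of the target.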
In this context, the problem we want to solve is the following: having a set of concepts of web data, assess their semantic similarity based on FO and evaluate the results.

A NEW SEMANTIC SIMILARITY MEASURE BASED ON FO
Semantic similarity computes the similarity between concepts that need not be lexically similar. 38 Ontology-based analysis has been a popular approach in recent years for semantic similarity assessment. However, existing approaches mainly utilize the CO structure. 34 Despite the information richness of FO, it has been considered only to a limited extent in the literature for semantic similarity measurement. A FO is more informative than a CO: in addition to providing the taxonomical relations of concepts, it provides information on the strength of the relation between concept pairs. Accordingly, similarity assessment based on FO could provide results that are closer to reality. The input for similarity assessment based on FO is a FO of concepts connected by fuzzy relations. If a FO is not available, a CO can be converted to a FO using the algorithm proposed in Ref. 39. This approach generates a FO from a CO by means of a distance matrix of concepts inferred from the web, web_dist(C), mapping the two-dimensional web_dist matrix to a nested ontology structure.

Assessing Concept Similarity Based on FO
This section proposes a novel approach for semantic similarity assessment based on FO. Before proceeding, let us state some propositions that will be used later in this section:
• Two completely similar objects give the maximum similarity, sim(x, y) = 1, whereas the least similar pairs give the minimum value, sim(x, y) = 0. That is, similarity is the complement of dissimilarity in the range [0, 1], so one can easily be derived from the other as follows: 40
$$\mathrm{sim}(x, y) = 1 - \mathrm{dist}(x, y)$$
• In a FO = (C, $R_f$, D), the fuzzy value of the taxonomical "is-a" relation between a concept pair demonstrates their level of belonging to each other. Accordingly, the fuzzy value of a taxonomical "is-a" relation existing in a FO can be considered as the similarity of the concepts it connects, that is, between $c_i$ and $c_j$:
$$\mathrm{sim}(c_i, c_j) = \mu_R(c_i, c_j) \quad (13)$$
Assuming that $c_i$ is the node with lower depth in comparison with $c_j$, that is, depth($c_i$) ≤ depth($c_j$), the relation between $c_i$ and $c_j$ lies in one of the following two categories:
1. $c_i$ is the parent or an ancestor of $c_j$.
2. $c_i$ and $c_j$ are neither parent nor ancestor of one another but are connected to each other via common ancestors.
According to this partitioning, our proposed approach for assessing the semantic distance of any concept pair $(c_i, c_j)$ in a FO is defined by Algorithm 1.
If $c_i$ is an ancestor of $c_j$: As mentioned earlier in Equation 13, for concepts directly connected to each other by an "is-a" relation, the fuzzy value of their relation, that is, $\mu_R(c_i, c_j)$, is set as their similarity. However, when there are intermediate relations on the path from $c_i$ to $c_j$, all these relations are composed using the fuzzy max-product composition defined in Equation 8 to generate the final value for the similarity of $c_i$ to $c_j$. Afterward, considering Equation 15, their distance equals 1 minus their similarity.
More precisely, let us assume that there are k intermediate concepts $x_1, \ldots, x_k$ between $c_i$ and $c_j$. The similarity of $c_i$ to $c_j$ is calculated by max-product composition (Equation 8) of all intermediate fuzzy relations on the path from $c_i$ to $c_j$ as follows:
$$\mathrm{sim}(c_i, c_j) = \mu_R(c_i, x_1) \cdot \mu_R(x_1, x_2) \cdots \mu_R(x_k, c_j)$$
and accordingly their distance is equal to:
$$\mathrm{FO\_dist}(c_i, c_j) = 1 - \mathrm{sim}(c_i, c_j) \quad (15)$$
If $c_i$ and $c_j$ are connected via common ancestor(s), which means that the path from $c_i$ to $c_j$ passes through their common ancestor(s), the path containing their LCA is the shortest path between $c_i$ and $c_j$. 41 Subsequently, their distance is set as the sum of the distances of each to their LCA. To calculate the distance of each node to the LCA, the method defined above is used, that is, the max-product composition of the relations on the path from each node to the LCA is calculated. As an example, consider Figure 2. The similarity of $c_8$ and $c_{10}$ is calculated by considering the similarity of each to $c_3$, which is their LCA.
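Finally, the distance of each node to itself is set as 0. Consequently, the proposed distance measure can be summarized as follows:
$$\mathrm{FO\_dist}(c_i, c_j) = \begin{cases} 0 & \text{if } c_i = c_j \\ 1 - \mathrm{sim}(c_i, c_j) & \text{if } c_i \text{ is an ancestor of } c_j \\ \mathrm{FO\_dist}(\mathrm{LCA}(c_i, c_j), c_i) + \mathrm{FO\_dist}(\mathrm{LCA}(c_i, c_j), c_j) & \text{otherwise} \end{cases} \quad (16)$$
where $\mathrm{sim}(c_i, c_j)$ is the max-product composition (Equation 8) of the fuzzy relations on the path from $c_i$ to $c_j$.
International Journal of Intelligent Systems DOI 10.1002/int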
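The distance measure summarized above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the FO is assumed to be a tree with a single parent per node, so the path between a node and its ancestor is unique and the max-product composition along it reduces to a product of membership degrees. The concept names and membership values in `FO` are hypothetical.

```python
# Hypothetical fuzzy "is-a" hierarchy: child -> (parent, membership degree).
FO = {
    "c1": (None, 1.0),   # root
    "c2": ("c1", 0.9),
    "c3": ("c1", 0.6),
    "c4": ("c2", 0.5),
    "c5": ("c2", 0.8),
}

def chain(c, fo):
    """Return [c, parent(c), ..., root]."""
    out = []
    while c is not None:
        out.append(c)
        c = fo[c][0]
    return out

def dist_to_ancestor(c, anc, fo):
    """1 minus the product of memberships on the path from c up to anc."""
    sim = 1.0
    while c != anc:
        parent, mu = fo[c]
        sim *= mu
        c = parent
    return 1.0 - sim

def fo_dist(ci, cj, fo):
    """Distance of a node to itself is 0; otherwise sum the distances to the LCA."""
    if ci == cj:
        return 0.0
    up_i = set(chain(ci, fo))
    lca = next(n for n in chain(cj, fo) if n in up_i)
    # If ci is an ancestor of cj, the LCA is ci itself and the first term is 0.
    return dist_to_ancestor(ci, lca, fo) + dist_to_ancestor(cj, lca, fo)
```

With these hypothetical values, `fo_dist("c1", "c4", FO)` is 1 − 0.9 × 0.5, and `fo_dist("c4", "c5", FO)` adds the distances of both siblings to their common parent `"c2"`.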

Examples
To illustrate the behavior of our approach, let us consider the following portions of two FOs, as depicted in Figures 1 and 2, and calculate the semantic similarity of some of their concepts: Example 1. Consider the FO of Figure 1, which consists of six concepts connected by five fuzzy relations. According to the proposed method, the similarity between "Computer Science" and "Regression Analysis" and the distance between "Reinforcement Learning" and "Data Mining" are obtained as follows:

Properties of the Proposed Measure
In order to show the validity of the presented measure, we have studied the properties that a distance measure must fulfill. It is important to note that the fulfillment of those properties is a requirement if the measure is used in conjunction with some reasoning techniques. 42 A distance function must satisfy three properties, positivity, minimality, and symmetry, as stated by: 43
$$\text{Positivity: } \mathrm{dist}(x, y) \geq 0$$
$$\text{Minimality: } \mathrm{dist}(x, x) = 0$$
$$\text{Symmetry: } \mathrm{dist}(x, y) = \mathrm{dist}(y, x)$$

Proof.
Positivity: The calculation of the fuzzy distance between concepts A and B consists of three steps. First, the fuzzy value of the shortest path from each node to their LCA is calculated; then these values are subtracted from 1; and afterward the results are summed up. The first step is composed of multiplications of fuzzy values, so its result lies in the [0, 1] interval. Since the result lies in [0, 1], its subtraction from 1 is always non-negative, and adding two non-negative numbers results in a non-negative number.
Minimality: As mentioned earlier, the distance of each node to itself is set to zero.

A NEW APPROACH FOR EVALUATING FO-BASED SIMILARITY MEASURE
The literature on evaluation of ontology-based similarity measures is limited to consideration of CO. This section first investigates some criteria that can be considered in evaluation of a FO-based similarity measure. Next, a novel evaluation approach named as FOSE is introduced. To the best of our knowledge, FOSE is the first data-driven evaluation approach for FO-based similarity measures.

Possible Evaluation Criterion for a FO-Based Similarity Measure
To evaluate a CO-based dissimilarity measure, a common approach is to compare it with web data. 44 However, to evaluate the power of fuzziness in a model, it is common to compare it against its equivalent crisp model. Accordingly, in our evaluation, in addition to the web-based distance of the data, web_dist, we consider the equivalent CO-based distance measure, CO_dist.
CO_dist: To have an acceptable evaluation, the underlying logic for calculating the CO-based measure must be similar to that of the proposed FO-based similarity measure. Rada 45 proposed an approach for CO-based similarity measurement that utilizes the shortest path between concepts and their LCA in the same way as our proposed FO-based similarity measure. Accordingly, in our evaluation we use the Rada measure to calculate the CO-based distance, which is defined as follows:
$$\mathrm{CO\_dist}(c_i, c_j) = \mathrm{sp}(c_i, c_j) = |\mathrm{path}(c_i, \mathrm{LCA}(c_i, c_j))| + |\mathrm{path}(\mathrm{LCA}(c_i, c_j), c_j)| \quad (17)$$
web_dist: In order to evaluate distance based on web data, the co-occurrence of terms (concepts) in web files is considered, which is a common approach in the literature. 26 Co-occurrence refers to the number of times two specific concepts appear together in the same file. This criterion is used to evaluate the web-based distance of concepts.
FO_dist: The proposed FO-based distance measure (Equation 16) is used to calculate concept distances based on FO, which yields the distance matrix based on FO, FO_dist.
The three distance matrices are compared as follows to evaluate the proposed FO-based similarity measure:
1. The correlation between web_dist and CO_dist is evaluated and compared with the correlation between web_dist and FO_dist.
2. The RAE between web_dist and CO_dist is evaluated and compared with the RAE between web_dist and FO_dist.
3. The RRSE between web_dist and CO_dist is evaluated and compared with the RRSE between web_dist and FO_dist.
The correlation, RRSE, and RAE are calculated as in Equations 9, 10, and 11, respectively.

The Proposed Approach for FOSE
According to the previous section, evaluating a FO-based similarity measure requires comparing it against its corresponding CO-based similarity measure. The criterion for comparison could be their RAE (Equation 11) with respect to a standard web-derived distance matrix on which both are built. This evaluation approach clarifies how well FO has measured similarity in comparison with CO.
Considering the proposed FO-based similarity measure introduced in Section 3 and its respective CO-based similarity measure introduced in this section, a new evaluation criterion for the FOSE method is proposed, also denoted as FOSE. The criterion is ratio based: the RAE between the FO-based distance matrix and the web_dist matrix is its numerator, and the RAE between the CO-based distance matrix and the web_dist matrix is its denominator. Accordingly, the FOSE criterion is defined as follows:
$$\mathrm{FOSE} = \frac{\mathrm{RAE}(\mathrm{web\_dist}, \mathrm{FO\_dist})}{\mathrm{RAE}(\mathrm{web\_dist}, \mathrm{CO\_dist})}$$
where web_dist is the web-based distance matrix as defined in Section 4.2, FO_dist is the distance based on FO using the proposed method (Equation 16), CO_dist is the distance based on CO using Equation 17, and RAE(A, B) is the RAE between corresponding elements of A and B, calculated as in Equation 11.
FOSE thus considers three distance matrices to evaluate a FO-based distance measure: the distance matrix generated from web data (web_dist), the distance matrix based on CO (CO_dist), and the distance matrix based on FO (FO_dist). By comparing the RAE between web_dist and CO_dist with that between web_dist and FO_dist, FOSE evaluates a FO-based similarity measure. Thus, a FOSE value less than 1 indicates the superiority of FO over CO. The range of possible FOSE values and their meaning is given in Table I.
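The FOSE ratio described above can be sketched as follows. The sketch assumes that the three distance matrices are given as equally sized lists of rows and that all entries are compared element by element; the function names and the 3 × 3 matrices are hypothetical and serve only to show the mechanics.

```python
def rae(target, predicted):
    """Relative absolute error of `predicted` with respect to `target`."""
    mean_t = sum(target) / len(target)
    num = sum(abs(p - t) for t, p in zip(target, predicted))
    den = sum(abs(t - mean_t) for t in target)
    return num / den

def flatten(matrix):
    """Turn a list of rows into one flat list of entries."""
    return [v for row in matrix for v in row]

def fose(web_dist, co_dist, fo_dist):
    """RAE(web, FO) divided by RAE(web, CO); values below 1 favor the FO."""
    web = flatten(web_dist)
    return rae(web, flatten(fo_dist)) / rae(web, flatten(co_dist))

# Hypothetical distance matrices over three concepts.
WEB = [[0.0, 0.2, 0.8], [0.2, 0.0, 0.6], [0.8, 0.6, 0.0]]
FO_D = [[0.0, 0.3, 0.7], [0.3, 0.0, 0.6], [0.7, 0.6, 0.0]]
CO_D = [[0.0, 0.5, 1.0], [0.5, 0.0, 0.5], [1.0, 0.5, 0.0]]

score = fose(WEB, CO_D, FO_D)
```

Because both RAE terms share the same target matrix, their denominators cancel, and the ratio reduces to the total absolute deviation of the FO-based distances divided by that of the CO-based distances.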

CASE STUDY
In this case study, we focus on the effectiveness of the FO-based semantic similarity measure in the context of SNs. In particular, we focus on LinkedIn. 46 Members can create customizable profiles that detail employment history, business accomplishments, and professional competencies in their area of expertise. Consequently, they may develop contacts, find jobs, and answer questions.

FO Development
In the LinkedIn SN, each person fills out his/her own profile with a set of skills that defines his/her areas of expertise. A screenshot of the skills section of a LinkedIn profile is depicted in Figure 3. By crawling the profile data of 130 unique users, various skills were collected from their LinkedIn profiles.
In order to create an ontology of skills for the SN, individual skills must be clustered as the groundwork for ontology construction. A common metric for web data clustering is term co-occurrence. 47,48 Co-occurrence refers to the number of times two individual terms have been used within the same text file. Mapped to our context, this is the number of times two terms have been listed together as skills of the same individual. Calculating this metric resulted in 28,536 pairs of co-occurring terms. At the end of this stage, the dataset specifications were extracted as outlined in Table II, and the web-based distance matrix was constructed based on concept co-occurrences.
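The co-occurrence counting step can be sketched as below. The profiles are hypothetical, and the conversion from counts to distances (one minus the count normalized by the largest count) is an assumption made only for illustration, since the exact transformation used to build web_dist is not specified here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical skill sets, one per crawled profile.
profiles = [
    {"python", "data mining", "sql"},
    {"python", "sql"},
    {"data mining", "statistics"},
]

def cooccurrence(profiles):
    """Count how often each unordered pair of skills appears in the same profile."""
    counts = Counter()
    for skills in profiles:
        for a, b in combinations(sorted(skills), 2):  # sorted -> canonical pair order
            counts[(a, b)] += 1
    return counts

def to_distance(counts):
    """Assumed mapping: frequent co-occurrence means small distance."""
    top = max(counts.values())
    return {pair: 1.0 - n / top for pair, n in counts.items()}

counts = cooccurrence(profiles)
web_dist = to_distance(counts)
```

Pairs that never co-occur are absent from `counts`; under the assumed mapping they would be treated as maximally distant.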
To create a CO of terms, agglomerative hierarchical clustering (AHC) is applied on the skill set. AHC forms the clusters from "bottom-up" and is a common literature approach for learning of an ontology from a data set. 49 Having generated the CO, the corresponding CO-based distance matrix is generated using Equation 17.

Assessing FO-Based Concept Similarity
Having a CO, a FO is generated by the algorithm of Ref. 39, which maps a two-dimensional distance matrix of concepts to a hierarchical ontology structure. For the generated FO, the distance of concept pairs is calculated by the proposed approach (Equation 16), and the FO-based distance matrix is determined.

Evaluation of FO-Based Similarity Measure
At this step, the FO-based distance measure, FO_dist, is compared against the CO-based distance measure, CO_dist, using the correlation (Equation 9), RRSE (Equation 10), RAE (Equation 11), and the FOSE criterion. The overall evaluation results are summarized in Figures 4-7. Table III outlines the specifications of the four datasets considered in this experiment. All datasets are generated from the skill sets of LinkedIn profiles (Table II). The datasets are sorted in ascending order of size. Figure 4 compares the correlation between FO_dist and the web-based distance matrix, web_dist, with that between CO_dist and web_dist for all datasets. According to Figure 4, FO has a higher correlation with the initial web-based distance matrix than CO. Considering Table III, the larger the dataset, the greater the advantage of FO over CO for the conceptual modeling of the real world. This shows the potential of this approach for larger datasets, as is necessary for SN analysis.
In Figures 5 and 6, the FO-based similarity measure is compared with its corresponding CO-based similarity measure by means of the RAE and RRSE criteria. Figure 5 shows the RAE between each of FO_dist and CO_dist and web_dist. For all datasets, the RAE of the FO-based measure is lower than that of the CO-based measure, which underlines its advantage in semantic similarity assessment.
In the same way, Figure 6 illustrates the RRSE between each of FO_dist and CO_dist and web_dist. The results indicate the superiority of the FO-based measure over the CO-based measure, especially as the size of the datasets increases.
Finally, the proposed evaluation criterion, FOSE, is calculated for the proposed FO-based similarity measure and summarized in Figure 7. Considering Table I, values less than 1 and close to 0 indicate the superiority of the FO-based similarity measure over the CO-based similarity measure. The decreasing trend of the FOSE values with increasing dataset size and the near-zero value (0.07) for the largest dataset indicate the strong performance of the proposed method.

CONCLUSION AND FUTURE WORK
In this section, first the works of other authors, whose proposals are close to ours are reviewed and compared with our proposal. Then the paper is concluded and directions for future works are outlined.

Related Work
In this section, we first review the literature on CO-based semantic similarity measures and afterward consider the literature on a more narrowed topic of FO-based similarity measures.
Common approaches for CO-based similarity measures can be classified into edge-counting, information-based, and feature-based approaches. 50,51 Edge-counting approaches refer to the group of methods that utilize the ontology structure to assess similarity. Rada 45 considered the length of the shortest path between ontology nodes: the longer the path, the more semantically distant the concepts are considered. Later on, Refs. 52-54 integrated other features of the ontology structure, such as its depth, to improve the accuracy of this similarity measure.
Feature-based approaches consider the degree of overlap between sets of ontological features. Concepts are considered as sets of features, and their similarity is estimated based on the number of features they have in common. In this way, common features tend to increase similarity and non-common ones tend to diminish it. This approach originates from the Tversky model 55 and has been applied in molecular biology, 56 adaptive e-learning, 57 and ontology merging 58 and alignment. 59
Information-based approaches calculate statistical specifications of concepts based on a corpus or other data sources. 60 Common statistical specifications in the literature are concept frequency, 6 which refers to the number of occurrences in the corpus, and co-occurrence, 61 which considers the simultaneous occurrence of two concepts in the same file.
FO-based similarity assessment is mainly obtained by fuzzy FCA (FFCA). FFCA-based approaches belong to the feature-based approaches, since they consider the common features of concepts in the lattice to assess their similarity. FFCA has been used in various domains. For instance, Ref. 62 used it in combination with WordNet [a] for similarity assessment, and Ref. 44 used an ontology obtained by FCA to assess the similarity of products for information retrieval. A notable work in this domain is Ref. 61, which used the FFCA method in combination with information content theory to assess similarity. Despite its popularity, the FCA-based approach 63 is of high computational complexity. Its prerequisite is to build formal concept lattices, which is a complex task of $O(2^N)$ time order.
The proposed measure assesses similarity based on the shortest path between concepts, which is an edge-counting approach, in combination with fuzzy grades of membership, which is an information-based approach. Accordingly, the proposed measure lies in a hybrid category, as illustrated in Table IV.

Summary and Future Work
Assessing concept semantic similarity is a critical module in many applications of artificial intelligence. This paper proposed a novel approach for concept similarity assessment which is based on FO. The proposed approach incorporates fuzzy relation composition in combination with an edge counting approach to assess the similarity. Accordingly, proposed measure relies on taxonomical features of an ontology in combination with its fuzzy grades of membership. Consequently, the approach lies in the hybrid category of assessing similarity that utilizes both taxonomical and information content metrics.
In contrast to the literature on FO-based semantic similarity measures, the proposed approach does not utilize FCA. Considering the limited world modeling of FCA and its high computational complexity, the proposed approach has lower prerequisite complexity and higher flexibility. Furthermore, an evaluation method for FO-based similarity measures, named FOSE, was proposed, which determines the variation between the proposed FO-based similarity measure and real-world data and compares it with that of the CO-based similarity measure. To the best of our knowledge, FOSE is the first data-driven evaluation approach for FO-based similarity measures.
[a] http://wordnet.princeton.edu/
To evaluate the proposed measure by means of FOSE, a case study of the LinkedIn SN was considered. Experimental results reveal the superiority of the proposed FO-based similarity measure over CO-based measures with respect to its correlation with real-world data and error minimization. Our future work is concerned with assessing the similarity of concepts in a context with higher degrees of uncertainty based on interval and general type-2 fuzzy ontologies.