On Ranking RDF Schema Elements (and its Application in Visualization)

Abstract: Ranking is a ubiquitous requirement whenever we confront a large collection of atomic or interrelated artifacts. This paper elaborates on this issue for the case of RDF schemas. Specifically, several metrics for evaluating automatic methods for ranking schema elements are proposed and discussed. Subsequently, the creation of a test collection for evaluating such methods is described, upon which several ranking methods (from simple to more sophisticated) for RDF schemas are evaluated. This formal way of evaluating ranking methods, apart from yielding credible and repeatable results, gave us some interesting insights into the problem. Finally, our experiences from exploiting these ranking methods for visualizing RDF schemas, specifically for deriving and visualizing top-k schema subgraphs, are reported.


Introduction
As modern society and the economy increasingly depend on a deluge of digital information, the need for ranking becomes more and more crucial. The Semantic Web community has so far focused only on the task of ranking whole ontologies, a task that can aid the human-enacted process of ontology selection [Ding et al. 2005], [Patel et al. 2003], [Buitelaar et al. 2004], [Alani and Brewster 2005], [Sabou et al. 2006]. This paper focuses on the problem of ranking the elements of a single ontology, a problem that has not been studied in the literature so far. The motivation for this kind of ranking is that it can alleviate the human effort required for understanding the contents and the structure of an ontology. This task is quite hard for ordinary users, and is aggravated by the lack of satisfying ontology visualization tools; hence it is an obstacle to the realization and deployment of the Semantic Web. In addition, ranking could be exploited in a plethora of other tasks, e.g. for ordering the results of queries that return schema elements.
In this paper we elaborate on the problem of ranking schema elements. We start by introducing and discussing a number of metrics (mainly originating from the area of Information Retrieval) which can be used for evaluating automatic methods for ranking schema elements. Subsequently, we introduce, and formally evaluate, several possible ranking methods. This is another rather unique characteristic of our work, as most (probably all) other works on related topics do not follow any formal evaluation method. We then discuss how we have exploited these ranking methods for offering visualizations that can help users quickly understand an ontology. Specifically, the derivable top-k lists and top-k diagrams allow exploring an ontology gradually, from the more important elements to the less important ones (Figure 1 illustrates the idea).
Although we confine ourselves to RDF schemas [Brickley and Guha 1999], most of the material and results presented in this paper can be applied to any object-oriented schema, i.e. a schema defined using the classical o-o structuring mechanisms: classification (objects and classes), attribution, associations (between classes), and generalization/specialization (among classes and among associations).
This paper is organized as follows. Section 2 proposes and discusses metrics for evaluating methods that rank schema elements, and Section 3 describes the test collection that we have built. Section 4 defines RDF schemas and introduces notations, and Section 5 defines several ranking methods for RDF schemas. Section 6 reports the results of evaluating these ranking methods on the test collection. Section 7 describes an application of ranking methods to visualization and discusses related work. Finally, Section 8 concludes the paper and identifies issues for further research.

On Evaluating Ranking Methods for Schema Elements
Ranking structured data and knowledge is a relatively new task which becomes more and more important. However, most of the works on this topic lack a formal evaluation methodology. In this section we elaborate on various metrics that could be used for evaluating automatic methods for ranking schema elements.

Evaluation Metrics
One approach to evaluating a ranking method is to suppose that for each schema there is an ideal ranking of its classes according to their importance. This ranking could be provided by an "oracle" or an expert (i.e. a person who knows the schema well and can provide us with such a ranking), or by aggregating the rankings provided by several persons. In the latter case, the aggregated ranking can be obtained according to various methods (mainly coming from the area of Social Choice), like plurality ranking, Borda ranking [de Borda 1781], Condorcet ranking [de Condorcet 1785] or Kemeny optimal aggregation [Kemeny 1959], though we should keep in mind Arrow's impossibility theorem [Arrow 1951].
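Borda aggregation, used later (Section 3) to combine the experts' rankings, is simple to sketch. The following is a minimal illustration; the class names are hypothetical and the tie-break by name is our own choice:

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Borda count: in a ranking of length n, position i (0-based)
    earns n - i points; classes are ordered by total points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for i, cls in enumerate(ranking):
            scores[cls] += n - i
    # sort by descending score, breaking ties alphabetically
    return sorted(scores, key=lambda c: (-scores[c], c))

# three hypothetical expert rankings (best first)
experts = [["Event", "Actor", "Place", "Time"],
           ["Actor", "Event", "Time", "Place"],
           ["Event", "Place", "Actor", "Time"]]
print(borda_aggregate(experts))  # ['Event', 'Actor', 'Place', 'Time']
```

Kemeny optimal aggregation, by contrast, is NP-hard to compute in general, which is one practical reason to prefer Borda-style positional methods.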
A ranking of the set of classes C = {c_1, ..., c_N} can be represented by a bijection f over [1..N] that defines a linear ordering of C. In particular, given two classes c_i and c_j, if f(i) < f(j) then f ranks c_i higher than c_j. We denote this ranking by C_f = ⟨c_f(1), ..., c_f(N)⟩; we write C_f(k) for the sequence of its first k elements and C_f{k} for the set of those elements. Let us denote the ideal ranking by I, i.e. C_I = ⟨c_I(1), ..., c_I(N)⟩. What we need is a method for comparing an automatically derived ranking C_A = ⟨c_A(1), ..., c_A(N)⟩ with respect to C_I. Below we introduce and discuss a number of metrics that can be used for our purposes, inspired by the measures used in the area of Information Retrieval (IR).

k-Precision
One metric used in IR is R-precision, where R stands for the number of documents in the test (evaluation) collection that are known to be relevant to the evaluation query. It is a single-number metric which, in contrast to other measures (like precision, recall, E-measure), somehow takes into account the order of the documents returned by the system (because it considers only the first R elements returned by the system). In our case, in place of R we can use any k that is appropriate to our needs and thus define:

sim_p(I, A, k) = |C_I{k} ∩ C_A{k}| / k

- 1..k-precision
Consider two rankings C_A and C_B whose first k elements contain exactly the first k elements of the ideal ranking, i.e. C_A{k} = C_B{k} = C_I{k}. If C_A(k) preserves the ideal order while C_B(k) does not, then A is definitely better than B, yet this cannot be identified with the k-precision, because sim_p(I, A, k) = sim_p(I, B, k) = 1. To tackle such cases we can use k measures instead of one, specifically the measures 1-precision, ..., k-precision. The results of comparing two rankings according to this k-valued measure can be illustrated by plotting the interpolating curve for each of the two sets of k values. The higher a curve is (on the Y-axis), the better.

- PR-curves
We could also define a measure analogous to the Precision/Recall curves of IR.
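The k-precision measure, and the 1..k-precision profile built from it, can be sketched as follows (a minimal illustration with hypothetical class names; rankings are lists ordered from most to least important):

```python
def k_precision(ideal, automatic, k):
    """sim_p(I, A, k) = |C_I{k} ∩ C_A{k}| / k  (rankings as lists, best first)."""
    return len(set(ideal[:k]) & set(automatic[:k])) / k

ideal = ["c1", "c2", "c3", "c4", "c5"]
auto = ["c2", "c1", "c5", "c3", "c4"]
# the 1..k-precision profile: one value per prefix length
profile = [k_precision(ideal, auto, k) for k in range(1, 6)]
```

Plotting `profile` against k gives exactly the interpolating curve described above.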
However, here we have to decide which are the "relevant" elements. We can consider as relevant the first k elements of I. It is necessary to choose a k < N, otherwise the resulting PR-curves coincide with the function f(x) = 1. Suppose we have selected such a k. The curve is defined as follows: we start scanning C_A and when we meet the first element, say at position i_1, that belongs to C_I{k}, we compute the precision at that position, i.e. the value 1/i_1; when we find the second, say at position i_2, we compute the value 2/i_2, and so on. We continue in this way until we have consumed the entire C_A. We can then plot the interpolating curve of the values found.
Notice the differences with 1..k-precision: in 1..k-precision the numerator at position i is |C_A{i} ∩ C_I{i}|, whereas here relevance is always judged against the fixed set C_I{k}. One benefit of the latter is that C_I{k} is more "reliable" than C_I(k). We made this observation while building our test collection (discussed in detail in Section 3). Specifically, we noticed that experts could easily select the major 20 classes of a schema; however, it was not easy for them to order these 20 classes in a principled way (the ordering was done rather arbitrarily). Consequently, the results of an evaluation based on PR-curves are safer than those based on 1..k-precision. Finally, an obvious difference from 1..k-precision is that here we compute the precision only when a new relevant element occurs (and not at every i = 1..k).
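The scanning procedure just described can be sketched as follows (a minimal illustration; rankings are plain lists, best first):

```python
def pr_points(ideal, automatic, k):
    """Precision values, computed each time an element of the ideal
    top-k (C_I{k}) is met while scanning the automatic ranking C_A."""
    relevant = set(ideal[:k])
    points, found = [], 0
    for pos, cls in enumerate(automatic, start=1):
        if cls in relevant:
            found += 1
            points.append(found / pos)  # precision at this position
    return points

print(pr_points(["c1", "c2", "c3", "c4"], ["c3", "c1", "c4", "c2"], k=2))  # [0.5, 0.5]
```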
- Recall/Fallout curves
Instead of PR-curves, we can employ Recall/Fallout graphs. In IR, fallout is defined as the proportion of the non-relevant documents that have been retrieved. In our case, under the assumption that the first k elements of I are the relevant ones, the fallout at position i of C_A is the number of elements of C_A(i) that do not belong to C_I{k}, divided by N - k. Evaluation with RF-graphs has some theoretical advantages over evaluation with PR-curves. Moreover, RF-graphs can also be exploited practically, i.e. for improving the effectiveness of the under-evaluation system (for more see [Robertson 2007]).
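Under the same assumption (the ideal top-k as the relevant set), the recall and fallout values at each scan position can be computed as below; the N - k denominator for fallout is our reading of the standard IR definition adapted to this setting:

```python
def recall_fallout_points(ideal, automatic, k):
    """(recall, fallout) after each scan position, with the ideal
    top-k as the relevant set and N - k non-relevant elements in total."""
    relevant = set(ideal[:k])
    non_relevant = len(ideal) - k
    points, rel_seen, nonrel_seen = [], 0, 0
    for cls in automatic:
        if cls in relevant:
            rel_seen += 1
        else:
            nonrel_seen += 1
        points.append((rel_seen / k, nonrel_seen / non_relevant))
    return points
```

Plotting fallout on the X-axis against recall on the Y-axis gives the RF-graph; better rankings hug the recall axis.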

- Kemeny distance
A single measure that takes into account the order of the elements is the Kemeny distance [Kemeny 1959]. According to Kemeny, the distance between two rankings equals the number of pairwise disagreements between them.
This measure can be applied directly if we want to compare entire rankings. If we want to compare only C_A(k) and C_I(k), one problem is that they may not contain the same elements, i.e. it can be that C_A{k} ≠ C_I{k}. One way to overcome this problem is to assume that an element that is not present in C_A{k} resides at position k + 1 of C_A (as in [Tzitzikas 2001]). Formally, if c_i ∉ C_A{k}, we assume that A(k + 1) = i. We can make exactly the same assumption if we do not know the entire C_I but only C_I(k). This is very convenient for building a test collection, as it is much easier for an expert to come up with an ordered list of the top 20 elements than with an ordered list of 100 elements. Also note that this measure can be applied to weakly ordered sets as well.
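The Kemeny distance, with the position-(k+1) convention for elements missing from a top-k prefix, can be sketched as follows (a minimal illustration):

```python
from itertools import combinations

def kemeny_distance(rank_a, rank_b, k=None):
    """Pairwise disagreements between two rankings (lists, best first).
    If k is given, the top-k prefixes are compared; an element missing
    from a prefix is assumed to sit just below it, at position k + 1."""
    if k is not None:
        rank_a, rank_b = rank_a[:k], rank_b[:k]
    universe = sorted(set(rank_a) | set(rank_b))

    def pos(r, c):
        return r.index(c) if c in r else len(r)  # len(r) plays the role of position k + 1

    # a pair disagrees when the two rankings order it in opposite directions
    return sum(1 for c1, c2 in combinations(universe, 2)
               if (pos(rank_a, c1) - pos(rank_a, c2)) *
                  (pos(rank_b, c1) - pos(rank_b, c2)) < 0)
```

For example, fully reversing a three-element ranking disagrees on all three pairs.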
As all of the above metrics contain expressions of the form |C_I{k} ∩ C_A{k}|, it is important to be able to compute these values with accuracy. This is not always possible, because a ranking method may produce a lot of ties (especially if the schema is large). This is because there are not many graph features that can be used to discriminate classes. It follows that a ranking method for schemas is expected to yield more ties than a ranking method for documents, because in the latter case the documents are characterized by several words and consequently several numerical values (e.g. according to the TF-IDF weighting scheme). So it is safer to assume that a ranking method will derive weakly ordered sets (also called bucket orders, or rankings with ties). For instance, assume that C_A = ⟨c_1, {c_2, c_3}, c_4⟩, meaning that c_2 and c_3 are equally ranked. This means that C_A(2) is either ⟨c_1, c_2⟩ or ⟨c_1, c_3⟩, and that C_A{2} is either {c_1, c_2} or {c_1, c_3}. It follows that we cannot compute expressions of the form |C_I{k} ∩ C_A{k}| with accuracy. Also notice that the ideal ranking could be a weakly ordered set too.
We can rectify this problem by adopting an approach similar to that of the Expected Search Length (proposed in [Cooper 1968]). Specifically, we can consider that every possible linear order of a weakly ordered set is equiprobable and then compute the "expected" size of |C_I{k} ∩ C_A{k}|; e.g., if the four possible linearization pairs yield intersection sizes 1, 0, 2 and 1, the expected size is (1 + 0 + 2 + 1)/4 = 1. This method can be applied to all the above metrics.
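A sketch of this expected-size computation, representing a ranking with ties as a list of "buckets" of equally ranked classes. For intersection sizes, averaging over the equally likely completions of the boundary bucket is equivalent to averaging over all linearizations, since each completion corresponds to the same number of linear orders:

```python
from itertools import combinations

def possible_topk_sets(buckets, k):
    """All sets C{k} realizable by a ranking with ties, given as a list of
    buckets (sets of equally ranked classes), best bucket first."""
    chosen, remaining = set(), k
    for bucket in buckets:
        if len(bucket) <= remaining:
            chosen |= set(bucket)
            remaining -= len(bucket)
        else:
            # the boundary bucket: any subset of size `remaining` may complete the top-k
            return [chosen | set(extra)
                    for extra in combinations(sorted(bucket), remaining)]
    return [chosen]

def expected_overlap(buckets_a, ideal_topk, k):
    """Expected |C_I{k} ∩ C_A{k}| with all completions equiprobable."""
    sets_a = possible_topk_sets(buckets_a, k)
    return sum(len(s & set(ideal_topk)) for s in sets_a) / len(sets_a)
```

For C_A = ⟨c_1, {c_2, c_3}⟩ and ideal top-2 {c_2, c_3}, both completions overlap in exactly one element, so the expected size is 1.0.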

Building a Test Collection
Here we describe and discuss how we have built a test collection. We selected three RDF schemas:
(a) CIDOC CRM. It is the international standard (ISO 21127:2006) for the controlled exchange of cultural heritage information.
(b) PhOntology. It is a top-level ontology.
(c) Wine Ontology. It represents knowledge about wines and is one of the W3C example ontologies.
We selected these three schemas for several reasons. Firstly, we needed middle-sized schemas, because for small ones (e.g., with 20 classes) the problem of finding the top-k classes loses its interest, while for big ones (e.g., with 500 classes) the production of the ideal ranking by experts would be a very difficult task. In this manner we can also observe how different ranking methods are affected by the kind of schema (i.e., whether it is a rich ontology, a taxonomy, or something in between). Although building a bigger test collection is a challenging task, the three schemas selected here can be exploited for a first study (intended by this paper) on ranking RDF schema elements.
Table 1 shows the number of classes, properties and attributes (properties pointing to literal types) of each schema of the test collection.
Note that, in general, an RDF schema may reuse and extend elements from other schemas. Our test collection comprises schemas that do not import any other schema.

Deriving Ideal Rankings
Here we describe how we derived the ideal ranking of each schema.We refer to CIDOC as an indicative example of the procedure followed for all schemas.
Each of the four persons involved was asked to study the schema and then select the 10 (and 20) most important classes of the schema. In particular, we asked them to select the 10 classes which, in their opinion, would aid them to understand the schema if they did not know anything about it. As the selection of the most important classes is not a "well-defined" task, the evaluators did not feel comfortable at the beginning of the ranking process. Each evaluator was free to follow his own process for coming up with the top-10 or top-20 classes of the CIDOC ontology. Each person needed approximately 40 minutes to complete this task. The ontology was available as a printout containing a list of classes with the superclass of each, and a separate printout containing a table with the domain and range of the properties. It was also available electronically in Protégé [Noy et al. 2001] and visually through its Jambalaya viewer.
Two of the four persons began (each one separately) by excluding the classes they considered less significant, until they reached the 20 most important ones; they then continued until reaching the 10 most important. The other two (one of them is actually the chair of the CIDOC CRM committee) decided on the top-10 first and then went on to pick the top-20. All noticed that, although they could easily select the major 20 classes of the schema, it was not so easy for them to order these 20 classes in a principled way.
The union of the top-20 classes of the four experts contained 36 classes. We aggregated the results using the Borda method [de Borda 1781], assuming that each input consists of two blocks, ⟨{top 1-10}, {top 11-20}⟩. The resulting aggregated weak order (comprising 5 blocks) is shown in the left column of Table 2.
The middle and right columns of Table 2 show the results of an analogous process on PhOntology and the Wine Ontology. The evaluators of these ontologies were not those that evaluated CIDOC.

RDF Schemas
We consider only schemas that are valid according to the RDF semantics [Hayes 2004]. If ⟨c_1, p, c_2⟩ ∈ P, we write domain(p) = c_1 and range(p) = c_2. Table 3 introduces the notations that we use in the sequel. Roughly, for a given c ∈ C, we use conn(c) to denote the bag of classes that are connected with c through a property. We consider only user-defined classes, so the elements of conn(c) are always members of C. We use ⊎ to denote bag union (so the operands as well as the result of this operation can contain duplicates). Given a binary relation R, we use Tr(R) to denote its transitive closure. We also use Attrs(c) to denote the set of all properties of c whose range is a literal type. The bottom part of Table 3 contains notations that take into account the semantics of RDF.

Ranking Methods
Below we introduce a number of methods for ranking RDF schema classes.
(m_0): This is the most simplistic method. It views the schema graph as if it were a plain graph (i.e., it does not distinguish arc types): the score of a class is the number of properties in which it participates as domain or range.
(m_1): This is a variation of (m_0) that considers only the properties that have the given class as domain.
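As a minimal illustration of (m_0) and (m_1), using the small schema of Figure 2 (classes a, b, c) encoded as (domain, property, range) triples; the "Literal" marker for literal-typed ranges is our own convention:

```python
from collections import Counter

# the schema of Figure 2 as (domain, property, range) triples;
# literal-typed ranges would be marked with the token "Literal"
triples = [("a", "p1", "a"), ("a", "p2", "b"),
           ("b", "p3", "c"), ("c", "p4", "b")]

def m0_scores(triples, literal="Literal"):
    """(m_0): plain-graph view; score = number of attached properties."""
    s = Counter()
    for d, _, r in triples:
        s[d] += 1
        if r != literal:  # literal nodes are not classes
            s[r] += 1
    return s

def m1_scores(triples):
    """(m_1): score = number of properties having the class as domain."""
    return Counter(d for d, _, _ in triples)

print(m0_scores(triples))  # a and b score 3, c scores 2 (cf. conn() in Figure 2)
print(m1_scores(triples))  # a scores 2, b and c score 1
```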
(m_2): If we view classes as states and properties as bidirectional transitions, then according to this method the score of a class is equal to the stationary probability that a random surfer is at that class. Note that this method ignores properties pointing to literal types, as well as the generalization/specialization hierarchies.
(m_3): This is a variation of (m_2) that also considers the properties that point to literal types. Specifically, here the probability of reaching a class after a random jump is not the same for all classes: the more attributes a class has, the more probable it is that the surfer jumps to that class. Again, generalization/specialization hierarchies are ignored.
(m_4): This is a variation of (m_3) that takes into account the generalization/specialization relationships among classes. Each generalization relationship is viewed as a bidirectional transition. With probability q_1 the surfer jumps to a random class (with preference to those holding more attributes), with probability q_2 he chooses to follow one of the isa-derived transitions, and with probability 1 - q_1 - q_2 he chooses to follow one of the property-derived transitions.
(m_5): This is actually a variation of (m_1). It is like viewing the "completed" graph, i.e. the graph that particularizes all inherited features (superclasses, subclasses, property domains and ranges), as a plain graph.
It is evident that (m_3) is a special case of (m_4) (corresponding to q_2 = 0). Similarly, (m_2) is also a special case of (m_4).
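A simplified random-surfer computation in the spirit of (m_4) can be sketched as follows. This is only a sketch: the handling of classes without transitions of a given kind (teleporting the undistributable mass) and the +1 smoothing on attribute counts are our own assumptions, as the text does not specify them:

```python
def m4_scores(classes, props, isa, attrs, q1=0.15, q2=0.15, iters=100):
    """Random-surfer scores in the spirit of (m_4).
    props: (c1, c2) pairs linked by a property (treated bidirectionally);
    isa:   (sub, super) pairs (also bidirectional);
    attrs: class -> number of literal-valued attributes."""
    nbr_p = {c: [] for c in classes}
    nbr_i = {c: [] for c in classes}
    for a, b in props:
        nbr_p[a].append(b)
        nbr_p[b].append(a)
    for a, b in isa:
        nbr_i[a].append(b)
        nbr_i[b].append(a)
    # jump distribution biased toward attribute-rich classes (+1 smoothing)
    total = sum(attrs.get(c, 0) + 1 for c in classes)
    jump = {c: (attrs.get(c, 0) + 1) / total for c in classes}
    score = {c: 1 / len(classes) for c in classes}
    for _ in range(iters):
        nxt = {c: q1 * jump[c] for c in classes}
        for c in classes:
            for prob, nbrs in ((1 - q1 - q2, nbr_p[c]), (q2, nbr_i[c])):
                if nbrs:
                    share = score[c] * prob / len(nbrs)
                    for n in nbrs:
                        nxt[n] += share
                else:
                    # no transition of this kind: teleport the mass instead
                    for d in classes:
                        nxt[d] += score[c] * prob * jump[d]
        score = nxt
    return score
```

Setting q2 = 0 recovers the (m_3) behavior, and additionally ignoring the attribute bias recovers (m_2), mirroring the special-case relationships noted above.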
An alternative approach to ranking classes would be to first score the properties and then derive the scores or the ranking of the classes. One approach to defining the significance of properties is to exploit the definition of betweenness centrality of edges [Freeman 1977, Freeman 1979] in a graph. According to this definition, an edge is more important than another if it appears in more shortest paths between nodes. Formally, let G = (V, E) be a graph. Then the edge betweenness centrality BC(e) of an edge e ∈ E is defined as BC(e) = Σ_{u,w ∈ V, u ≠ w} σ_{u,w}(e) / σ_{u,w}, where σ_{u,w}(e) denotes the number of shortest paths between u and w that pass through edge e, and σ_{u,w} denotes the total number of shortest paths between u and w.
Note that to compute this metric we can exclude multiple edges from the schemas, because they do not affect the betweenness centrality. It is evident that in this way we actually rank the distinct pairs of classes rather than the properties. Below we present two methods for ranking classes based on the scores of the properties.
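Edge betweenness centrality as defined above can be computed by brute force for middle-sized schemas (a sketch; it sums over ordered pairs, so each unordered pair is counted twice, which scales all scores uniformly and does not change the ranking; Brandes' algorithm would scale better):

```python
from collections import deque

def edge_betweenness(nodes, edges):
    """BC(e) = sum over ordered pairs (u, w) of the fraction of
    shortest u-w paths crossing edge e. Edges are undirected pairs."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def shortest_paths(u, w):
        # BFS from u recording all predecessors on shortest paths
        dist, preds = {u: 0}, {u: []}
        q = deque([u])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    preds[y] = [x]
                    q.append(y)
                elif dist[y] == dist[x] + 1:
                    preds[y].append(x)
        paths = []
        def back(y, tail):
            if y == u:
                paths.append([u] + tail)
            else:
                for p in preds[y]:
                    back(p, [y] + tail)
        if w in dist:
            back(w, [])
        return paths

    bc = {frozenset(e): 0.0 for e in edges}
    for u in nodes:
        for w in nodes:
            if u == w:
                continue
            paths = shortest_paths(u, w)
            for path in paths:
                for x, y in zip(path, path[1:]):
                    bc[frozenset((x, y))] += 1 / len(paths)
    return bc
```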

(m_6): Here the score of a class c is defined as the sum of the BC scores of the properties that have c as domain or range, i.e. score(c) = Σ BC(p) over all p such that domain(p) = c or range(p) = c.
An alternative approach is to define the top-k classes as the classes that are encountered as domain or range of the top-l properties, where l is the smallest integer for which at least k classes are encountered. We can call this method (m_7).

Experimental Evaluation
We decided to evaluate the ranking methods (m_0), (m_1), (m_4), (m_5), (m_6) and (m_7). As noted in Section 5, (m_2) and (m_3) are special cases of (m_4) and are thus covered in the experimental evaluation through the appropriate parameter values of (m_4). For computing the (m_4) scores we employed an iterative algorithm with at most 100 iterations.
We decided to evaluate the above ranking formulas against the ideal ranking, because this method is less subjective and its results are repeatable. Moreover, the ideal ranking can serve as the baseline for comparing different ranking methods. Concerning evaluation metrics, we decided to employ the k-precision metric for various values of k. The selection of the values of k was based on the ideal ranking of the schema at hand. For instance, for CIDOC CRM we used the values 1, 7, 11, 15, 20, because these values follow the block sizes of the ideal ranking (see Table 2). As the points are not many, we present the results in tabular form rather than as curves. However, as an automatic ranking method may yield ties (equally scored classes), the computation of the k-precision was based on the Expected Search Length (recall Section 2), so as to obtain more precise results.
We should note that (m_5) gave by far the worst results. This is due to the fact that (m_5) always favors the classes located deep in the class hierarchies, which inherit a big number of properties from their ancestors. However, psycholinguistic evidence has shown that middle-level concepts tend to be more detailed and prototypical of their categories than classes at lower hierarchical levels [Rosch 1978]. This has also been experimentally verified for RDF schemas in [Theoharis 2007]. Moreover, methods (m_6) and (m_7) always gave worse (or rarely the same) results than (m_0), (m_1), (m_4). In the sequel, for each of the three schemas of our collection we only report the results of (m_5), (m_6) and (m_7) without further discussion. Instead, we focus on the comparison of the remaining methods (i.e., m_0, m_1, m_4).

Schema: CIDOC CRM
Table 4 shows the results of evaluating our ranking methods using 11- and 20-precision.
It is interesting to note that method (m_0) is almost as good as (m_4): although it does not give the maximum values (i.e. 0.72 and 0.75 for top-11 and top-20 respectively), it gives very close ones (i.e. 0.63 and 0.71 for top-11 and top-20 respectively). In addition, there are several combinations of q_1, q_2 values that make (m_4) give much worse results than (m_0). The recall/fallout curve of methods (m_0) and (m_4) (Figure 5) provides a means to compare these two methods. As we can see from the curve, (m_0) is slightly better than (m_4), since the curve of (m_0) usually lies closer to the recall axis. We should note that each association of the CIDOC CRM ontology has been represented in RDFS as two counterpoising properties between the involved pair of classes (each labeled with a different name). This does not hold for attributes, i.e., properties pointing to the Literal class, since Literal cannot be the domain of any property. This means that in this ontology, ranking classes according to prop(c) is almost equivalent to ranking them according to propFrom(c) or propTo(c).
Finally, Table 5 shows the top-20 distinct pairs of classes of CIDOC CRM.

Schema: PhOntology
Table 6 shows the results of evaluating our ranking methods using 10- and 20-precision.
Here, method (m_0) gives the best results: (m_0) yields a much higher 10-precision and a slightly lower 20-precision than (m_1). Concerning (m_4), the combinations of (q_1, q_2, q_3) values that yield the best 10-precision are of the form (q_1, 0.0, 1 - q_1), while the combination that yields the best 20-precision is (0.5, 0.4, 0.1). More details are given in Figure 6 for the 5- and the 10-precision, and in Figure 7 for the 14- and the 20-precision. We can conclude that for small values of k the k-precision increases as q_2 decreases. However, for high values of k (e.g. k = 20), the best k-precision is obtained for high q_2 values, i.e. q_2 = 0.4. Finally, Table 7 shows the top-20 distinct pairs of classes of PhOntology.

Summarizing the Results
In most cases, (m_0) gave the best results. (m_4) gave slightly better results only for specific combinations of (q_1, q_2, q_3), for the CIDOC and the Wine Ontology (rows 1, 2 and 3 of Table 4 and rows 3 and 4 of Table 8).
Table 10 shows the average precisions (over our test collection) that were obtained from each method for k = 7, 11, 20. For the case of (m_4), the table shows only the combinations of q_1, q_2, q_3 that gave the best results.
Concerning the (m_4) method, and in comparison with the CIDOC and PhOntology schemas, we observe that in the case of the Wine Ontology ignoring the subsumption properties (i.e. q_2 = 0.0) does not yield the best (or even a good) k-precision. This is due to the fact that the Wine Ontology contains few properties, so the subsumption relationships play a more significant role in the selection of the top-k classes.

On Generalizing the Results
As our test collection is small, we conducted some additional experiments in order to see whether the results of our comparative evaluation would be different with a bigger test collection. We selected 20 ontologies (those listed in Table 12) and applied to each of them the formulas (m_1), (m_0) and (m_4). Then we compared the returned rankings; specifically, to compare two ranking formulas A and B we used the metric sim(A, B) = |C_A{20} ∩ C_B{20}| / 20. Table 12 shows the results, and its last row shows the average values. We can see that the pair ((m_1), (m_0)) behaves very similarly (similarity 0.9), while the pairs ((m_1), (m_4)) and ((m_0), (m_4)) have a degree of similarity around 0.7. By hypothesizing that the 20-precision of (m_0) is around 0.7 (as measured in our test collection), and that the similarity of ((m_0), (m_4)) is not worse than 0.7 (as measured in this experiment), we can conjecture that in general the 20-precision of (m_0) is expected to be 0.7 ± 0.2.

Applications and Related Work

Visualization
Understanding an ontology (from its RDF representation in XML), or selecting from an ontology repository the ontology that best fits the requirements of an application, is a hard and time-consuming task for a human. In this section we show how ranking can be exploited in visualization so as to alleviate this task.
For this purpose we developed a graphical editor for visualizing RDF schemas, called StarLion (this tool is part of the RDFSuite [FORTH-ICS 2005]). The graphical layout in 2D space is derived by a force-directed placement algorithm (specifically, by adapting for RDF schemas the algorithm described in [Tzitzikas and Hainaut 2005]). Figure 10 shows the layout produced for the CIDOC CRM schema. It is evident that such drawings cannot help a user get acquainted with a schema. From our experiments, we again verified the observation (of [Tzitzikas and Hainaut 2005] for ER diagrams, and of [Theoharis 2007] for RDF schemas) that conceptual schemas tend to have a very connected kernel. This means that it is practically impossible to obtain a readable (and aesthetically pleasing) 2D layout for medium and large sized schemas; instead we should expect an overwhelming number of edge crossings. Ranking can alleviate this problem, as it enables deriving smaller (and more readable) diagrams comprising the most important concepts and relationships. The derivable top-k lists and top-k diagrams allow exploring a schema gradually, from the more important elements to the less important ones (Figure 1 illustrates the idea). In addition, given the inability to automatically produce aesthetically satisfying layouts for large schemas, the small (top-k) diagrams can be very useful for communication purposes. We implemented the ranking methods described above and provided the ability to visualize top-k diagrams for variable values of k. Apart from being human-readable, these diagrams can be drawn much faster; note that the force-directed placement algorithm has quadratic time complexity. For instance, the layout of Figure 10 required 25.637 seconds to compute, while the top-7 diagram required only 3.024 seconds. An internal graph representation with double adjacency lists was employed, allowing the efficient layout of the top-k diagrams for small values of k without having to reconstruct the graph. In this way we obtain efficient drawing for values of k up to 20 (the most common range of values).
In addition, StarLion supports a semi-automatic layout process, which can greatly improve the resulting layout. Specifically, the user is allowed to nail down some nodes at desired positions and apply the layout algorithm to the rest of the graph.

Ranking Properties
For deriving the top-k diagram we first select the top-k classes and then add all properties among them. Usually there will be several properties between a given pair of classes; in that case they are visualized by a single edge with one label per property. However, one might want to reduce the number of property labels displayed (e.g. in Figure 11 there is one edge with more than 10 labels). We could try to rank the properties so as to have a criterion for filtering them out. One can easily see that if we exploit the scores of the classes for defining the scores of the properties, then all properties between a given pair of classes receive the same score, so this approach does not allow us to rank them. For this reason, we propose exploiting the specialization relationships between properties in order to differentiate them. Specifically, we can define the score of a property p as score(p) = |subAll(p)|, where subAll(p) denotes the set of all (direct and indirect) subproperties of p. The higher the score of a property, the higher it is ranked.
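The score(p) = |subAll(p)| computation amounts to a transitive closure over the subproperty relation. A minimal sketch with hypothetical property names:

```python
def property_scores(subprop):
    """score(p) = |subAll(p)|: number of direct and indirect subproperties.
    subprop maps each property to its list of direct subproperties."""
    def sub_all(p, seen=None):
        seen = set() if seen is None else seen
        for s in subprop.get(p, []):
            if s not in seen:
                seen.add(s)
                sub_all(s, seen)  # recurse into indirect subproperties
        return seen
    return {p: len(sub_all(p)) for p in subprop}

# hypothetical subproperty hierarchy
hierarchy = {"influenced": ["carried_out"], "carried_out": ["performed"], "performed": []}
print(property_scores(hierarchy))  # {'influenced': 2, 'carried_out': 1, 'performed': 0}
```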

Related Work
Most (or probably all) of the work that has been done so far concerns solely the problem of ontology selection and evaluation, i.e. the problem of ranking a set of ontologies according to various criteria. This task is very useful for Semantic Web search engines and ontology libraries (like Swoogle [Ding et al. 2005], OntoSelect [Buitelaar et al. 2004], OntoKhoj [Patel et al. 2003]). For instance, OntoKhoj [Patel et al. 2003] is an ontology portal that crawls, classifies and searches ontologies. It uses the OntoRank algorithm, a PageRank-like algorithm which, instead of relying on HTML links, relies on the instantiation and subsumption links between ontologies. Analogous approaches (for ranking ontologies, not their constituents) include ActiveRank [Alani and Brewster 2005] and OntoSelect [Buitelaar et al. 2004] (the latter also considers ontology import statements). Swoogle [Ding et al. 2005] is another SW search engine that provides ontology ranking in a way similar to that of OntoKhoj. Although that work also proposes methods for ranking graphs, terms and triples, all these methods are based on the entire network of Semantic Web ontologies and data. This means that the results obtained by these methods are totally different from ours, as we confine ourselves to the information available in each ontology in isolation (so we do not take into account possible instantiations of the ontology or other interconnected ontologies). Other somewhat related works include [Zhuge and Zheng 2003], where PageRank-style ranking formulas for semantic networks are proposed but without any form of evaluation, and [Sheth et al. 2005, Anyanwu et al. 2005], which describe methods for ranking associations and paths among resources (not among schema elements).
Finally, we would like to stress that we have conducted an experimental evaluation based on a test collection (to the best of our knowledge, no other related work has been evaluated in this way), and this allowed us to realize the, for us previously unexpected, fact that simplistic ranking methods are not worse than complicated ones.
Regarding visualization, we confined ourselves to classical 2D plain-graph layout algorithms, as our focus is on ranking, not on visualization.

Concluding Remarks
An interesting direction for further research is to investigate what leads modelers (experts) to decide that a class is significant. Investigating the maximum precision that could be achieved by a ranking function based only on schema graph features could also be useful for evaluating ranking methods (e.g., by realizing that a method is optimal). Finally, another direction of research is to devise ranking methods for OWL [Dean et al. 2002] schemas. The question here is how to best exploit the logical expressions (e.g. unionOf and intersectionOf) that may be used in the definition of a class for ranking purposes.

Figure 1: Progressive visualization based on top-k diagrams

Additionally, we would like to experiment on schemas with various values of |C|/|P|, where |C| and |P| are the numbers of classes and properties respectively. So we selected a schema (CIDOC CRM) with |C| << |P|, i.e., a rich ontology, another (PhOntology) with |C| ≈ |P|, and finally one (Wine Ontology) with |C| >> |P|, i.e., a schema that can be considered more a taxonomy than an ontology.

Figure 2: An example of an RDF schema with three classes a, b, c, and four properties ⟨a, p_1, a⟩, ⟨a, p_2, b⟩, ⟨b, p_3, c⟩, ⟨c, p_4, b⟩. In this case we have: conn(a) = [a, a, b], conn(b) = [a, c, c], conn(c) = [b, b]. As an example of bag union, [a, b, c] ⊎ [a] = [a, a, b, c].

Figure 10: The layout of the entire CIDOC CRM ontology

Figure 11 shows the top-7 diagram, Figure 12 the top-11, and Figure 13 the top-20 diagram (due to ties this diagram comprises 23 classes) of CIDOC CRM.

Figure 11: The layout of the top-7 diagram of CIDOC CRM

Figure 12: The layout of the top-11 diagram of CIDOC CRM

Figure 13: The layout of the top-20 diagram of CIDOC CRM

Table 1: Schema Features of the Test Collection

Suppose we want to compute |C_A{2} ∩ C_B{2}| when both rankings contain ties. Then C_A{2} ∩ C_B{2} can be one of the following: {c_2}, ∅, {c_1, c_2}, {c_1}. So the expected value of |C_A{2} ∩ C_B{2}| is (1 + 0 + 2 + 1)/4 = 1.

Definition 1. The graph of an RDF schema is a graph Γ = ({C ∪ L}, P, SC, SP) where C is a set of nodes labeled with a class name, L is a set of nodes labeled with a literal type name, P is a set of property arcs between nodes, SC is the subclass relation over C, and SP is the subproperty relation over P.

Table 2: The ideal rankings of the test collection

Table 3: RDF-related Notations

Table 4: Results of the Evaluation on CIDOC CRM

Figure 5: The Recall/Fallout curve of (m_0) and (m_4) on CIDOC CRM

Table 6: Results of the Evaluation on PhOntology

Table 7: Top-20 pairs of classes of PhOntology

Table 8: Results of the Evaluation on Wine Ontology

Table 10: Average Precision

Table 11: Features of classes of the ideal ranking for CIDOC CRM