Published December 1, 2009 | Version v1
Journal article Open

Efficient Algorithm for Computing Link-Based Similarity in Real World Networks

Description

Similarity calculation has many applications, such as information retrieval, and collaborative filtering, among many others. It has been shown that link-based similarity measure, such as SimRank, is very effective in characterizing the object similarities in networks, such as the Web, by exploiting the object-to-object relationship. Unfortunately, it is prohibitively expensive to compute the link-based similarity in a relatively large graph. In this paper, based on the observation that link-based similarity scores of real world graphs follow the power-law distribution, we propose a new approximate algorithm, namely Power-SimRank, with guaranteed error bound to efficiently compute link-based similarity measure. We also prove the convergence of the proposed algorithm. Extensive experiments conducted on real world datasets and synthetic datasets show that the proposed algorithm outperforms SimRank by four-five times in terms of efficiency while the error generated by the approximation is small.

Files

article.pdf

Files (832.0 kB)

Name Size Download all
md5:756163a7b5a3a6dfa18710dfdeabf51c
832.0 kB Preview Download