GraphDB-SE RDF Rank

Current version by reneta.popova on Aug 26, 2014 18:00.

h1. Introduction

RDF Rank is an algorithm that identifies the more important or more popular entities in the repository by examining their interconnectedness. The popularity of entities can then be used to order query results, similar to how Google orders search results using PageRank [http://en.wikipedia.org/wiki/PageRank].
The RDF Rank component computes a numerical weighting for all the nodes in the entire RDF graph stored in the repository, including URIs, blank nodes and literals. The weights are floating point numbers with values between 0 and 1 that can be interpreted as a measure of a node's relevance/popularity.
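To make the idea of weighting nodes by interconnectedness concrete, here is a minimal sketch of a PageRank-style iteration over a tiny graph. It is an illustration only and is not the algorithm GraphDB implements: the function name {{rank_nodes}}, the damping factor, the iteration count, the {{ex:}} node names and the scaling by the top score (used here to keep values between 0 and 1) are all assumptions made for this example.

```python
def rank_nodes(edges, damping=0.85, iterations=50):
    """Return PageRank-style scores, scaled so the top node scores 1.0.

    Hypothetical helper for illustration; not GraphDB's RDF Rank code.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    # Start from a uniform distribution over all nodes.
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is shared by everyone.
        dangling = sum(score[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_score = {n: base for n in nodes}
        # Each node passes a damped, equal share of its score to its targets.
        for n in nodes:
            for target in out_links[n]:
                new_score[target] += damping * score[n] / len(out_links[n])
        score = new_score

    # Assumed normalization: divide by the largest score.
    top = max(score.values())
    return {n: s / top for n, s in score.items()}


# Toy graph: three people linked to one city, plus one link between people.
edges = [
    ("ex:alice", "ex:london"),
    ("ex:bob", "ex:london"),
    ("ex:carol", "ex:london"),
    ("ex:carol", "ex:bob"),
]
ranks = rank_nodes(edges)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{node}  {ranks[node]:.2f}")
```

In this toy graph the node with the most incoming links ({{ex:london}}) receives the highest score, and nodes with no incoming links receive the lowest, which is the intuition behind using such weights as a popularity measure.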
Since the values range from 0 to 1, the weights can be used for sorting a result set (lexicographical order works even if the rank literals are interpreted as plain strings). Here is an example SPARQL query that uses RDF Rank to sort results by their popularity:
{noformat}

computes RDF Rank values for those resources that do not have an associated value, i.e. those that have been added to the repository since the last full RDF Rank computation.

{info}
The incremental computation uses a different algorithm, which is lightweight (in order to be fast) but not as accurate as the proper ranking algorithm. As a result, ranks assigned by the proper and the lightweight algorithms will diverge slightly from each other.
{info}

{noformat}

If the export fails, the update throws an exception and an error message is recorded in the log file.

Lastly, when using [RDF Priming|GraphDB-SE Experimental Features#RDF Priming], the RDF Rank values can be used as the initial activation values. To set this up, use the following update: