OWLIM-SE Query Behaviour

Version 1 by barry.bishop
on Nov 03, 2011 16:45.

compared with
Version 2 by Jeen Broekstra
on Feb 01, 2012 21:57.

* Property Paths
* Assignment
* An expanded set of functions and operators

*SPARQL 1.1 Update* provides a means to change the state of the database using a query-like syntax. SPARQL Update has similarities to SQL INSERT INTO, UPDATE WHERE and DELETE FROM behaviour.
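
As a rough illustration of the SQL analogy, the following sketch shows three common update forms chained in a single request. The {{ex:}} namespace, the resource name and the label values are placeholders invented for this example, not part of any real dataset.

{noformat}
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Placeholder namespace, assumed for illustration only
PREFIX ex: <http://example.org/>

# Add one statement (roughly comparable to SQL INSERT INTO)
INSERT DATA { ex:concept1 rdfs:label "Old label" } ;

# Replace a value in all matching statements (comparable to SQL UPDATE ... WHERE)
DELETE { ?s rdfs:label "Old label" }
INSERT { ?s rdfs:label "New label" }
WHERE  { ?s rdfs:label "Old label" } ;

# Remove all matching statements (comparable to SQL DELETE FROM ... WHERE)
DELETE WHERE { ex:concept1 rdfs:label ?label }
{noformat}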

*SPARQL 1.1 Federation* provides extensions to the query syntax for executing distributed queries over any number of SPARQL endpoints. This new feature from Sesame 2.6 is very powerful, and allows integration of RDF data from different sources using a single query.

For example, imagine we have a repository filled with data about persons, and we want to find out, for each person in our repository, whether a DBPedia resource about a person with the same name exists:

{noformat}
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

SELECT ?dbpedia_id
WHERE {
   ?person a foaf:Person ;
           foaf:name ?name .
   SERVICE <http://dbpedia.org/sparql> {
      ?dbpedia_id a dbpedia-owl:Person ;
                  foaf:name ?name .
   }
}
{noformat}

The above query matches the first part against our own repository. For each person it finds, it then checks the DBPedia SPARQL endpoint to see whether a person with the same name exists and, if so, returns that person's id.


Since Sesame repositories are also SPARQL endpoints, you can also use the federation mechanism to do distributed querying over several repositories on your own server. For example, imagine we have two repositories: one (called 'my_concepts') with triples about concepts, and a separate one (called 'my_labels') that contains all the label information. We can then get back the corresponding label for each concept by executing the following query on the 'my_concepts' repository:


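{noformat}
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# The ex: namespace below is a placeholder; use the namespace of your own data
PREFIX ex: <http://example.org/>

SELECT ?id ?label
WHERE {
   ?id a ex:Concept .
   SERVICE <http://localhost:8080/openrdf-sesame/repositories/my_labels> {
      ?id rdfs:label ?label .
   }
}
{noformat}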
Federation must be used with caution: first, to avoid excessive querying of remote (public) SPARQL endpoints, and second, because it can lead to inefficient query patterns. The following example finds resources in the second SPARQL endpoint that have a similar rdfs:label to the rdfs:label of <[http://dbpedia.org/resource/Vaccination]> in the first SPARQL endpoint:

{noformat}
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>

}
BINDINGS ?endpoint1_id
{ ( <http://dbpedia.org/resource/Vaccination> ) }
{noformat}

The SPARQL query syntax provides a means to execute queries across default and named graphs using FROM and FROM NAMED clauses. These clauses are used to build an 'RDF Dataset' that identifies what statements the SPARQL query processor will use to answer a query. The dataset contains a default graph and named graphs and is constructed as follows:

* *FROM uri* - brings statements from the database's graph identified by 'uri' into the dataset's default graph, i.e. the statements 'lose' their graph name
* *FROM NAMED uri* - brings the statements from the database's graph identified by 'uri' into the dataset, i.e. the statements keep their graph name

If either FROM or FROM NAMED is used, then the database's default graph is no longer used as input for processing this query. In effect, the combination of FROM and FROM NAMED clauses exactly defines the dataset. This is somewhat bothersome, as it precludes the possibility, for instance, of executing a query over just one named graph and the default graph. However, there is a programmatic way to get around this limitation that is described below.
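
The dataset construction described above can be sketched with a query that uses both clauses. The two graph names are hypothetical and stand in for graphs that exist in your own database.

{noformat}
SELECT *
FROM <http://example.org/graphA>
FROM NAMED <http://example.org/graphB>
WHERE {
   # Matched against the dataset's default graph, i.e. the statements of graphA only
   { ?s ?p ?o }
   UNION
   # ?g can only bind to graphB, the single named graph in the dataset
   { GRAPH ?g { ?x ?q ?y } }
}
{noformat}

Under standard SPARQL dataset semantics, statements held in the database's own default graph would not be matched by either branch of this query, because neither clause brings them into the dataset.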

|| Clause || Behaviour ||
| {{FROM <}}{{[http://www.ontotext.com/explicit]}}{{>}} | The dataset's default graph will include only explicit statements from the database's default graph |
| {{FROM <}}{{[http://www.ontotext.com/implicit]}}{{>}} | The dataset's default graph will include only inferred statements from the database's default graph |
| {{FROM NAMED <}}{{[http://www.ontotext.com/explicit]}}{{>}} | The dataset will contain a named graph called [http://www.ontotext.com/explicit] that contains only explicit statements from the database's default graph, i.e. quad patterns such as GRAPH ?g \{?s ?p ?o\} will bind to explicit statements from the database's default graph with a graph name of <[http://www.ontotext.com/explicit]> |
| {{FROM NAMED <}}{{[http://www.ontotext.com/implicit]}}{{>}} | The dataset will contain a named graph called [http://www.ontotext.com/implicit] that contains only implicit statements from the database's default graph |

Note that these are only flags: they do not affect the construction of the default dataset, in the sense that any combination of the above still results in a dataset containing all the named graphs from the database. All that changes is which statements appear in the dataset's default graph and whether any extra named graphs (explicit or implicit) appear.
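
As a minimal sketch, assuming the behaviour listed in the table above, a query along these lines should return only the inferred statements from the database's default graph:

{noformat}
SELECT ?s ?p ?o
FROM <http://www.ontotext.com/implicit>
WHERE {
   ?s ?p ?o .
}
{noformat}

Using the explicit graph name in the FROM clause instead should restrict the results to explicit statements.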
\\
PREFIX ent: <[http://www.ontotext.com/owlim/entity#]> \\
SELECT * WHERE \{ \\
?s ent:id ?id \\
\} ORDER BY ?id ||

PREFIX ent: <[http://www.ontotext.com/owlim/entity#]> \\
SELECT * WHERE \{ \\
?s ?p ?o . \\
\} order by ent:id(?o) ||

\\
|| Clause || Behaviour ||
| {{FROM/FROM NAMED <}}{{[http://www.ontotext.com/disable-sameAs]}}{{>}} | Switches off the enumeration of the equivalence classes produced by owl:sameAs during triple pattern matching (enumeration is the default behaviour), so that the solutions this enumeration would add are excluded. Its purpose is to reduce the number of results to only those that are valid for a single representative of the class (this is a rough description and not fully explanatory). For example, given a triple that matches a pattern, {{test:Inst rdf:type test:SomeClass}}, where {{test:Inst}} is {{owl:sameAs}} {{test:Inst2}}, by default there would be 2 triples matching the pattern, one for {{test:Inst}} and another for {{test:Inst2}}. Using the above system graph in {{FROM/FROM NAMED}} clauses excludes such redundancies. BE AWARE that if the query uses filters over the textual representation of a node, this modifier may skip some valid solutions, since not all the nodes within an equivalence class will be matched against such a {{FILTER}}. |
| {{FROM/FROM NAMED <}}{{[http://www.ontotext.com/count]}}{{>}} | Triggers evaluation of the query so that it gives a single result in which all the variable bindings in the projection are replaced with a plain literal holding the total number of solutions of the query, i.e. the equivalent of COUNT\(*) from SQL. In the case of a CONSTRUCT query in which the projection contains three variables (?subject, ?predicate, ?object), the subject and the predicate will be bound to {{<}}{{[http://www.ontotext.com/]}}{{>}} and the object will hold the literal value. This is because a statement cannot have a literal in the place of the subject or predicate. |
| {{FROM/FROM NAMED <}}{{[http://www.ontotext.com/skip-redundant-implicit]}}{{>}} | Triggers the exclusion of implicit statements when an explicit one exists within a specific context (even the default one). Initially implemented to allow filtering of redundant rows where the context part is not taken into account, which leads to 'duplicate' results. |
| {{FROM <}}{{[http://www.ontotext.com/distinct]}}{{>}} | Using this special graph name in {{DESCRIBE}} and {{CONSTRUCT}} queries will cause only distinct triples to be returned. This is useful when several resources are being described, where the same triple can be returned more than once, i.e. when describing its subject and its object. |
| {{FROM <}}{{[http://www.ontotext.com/owlim/cluster/control-query]}}{{>}} | Identifies the query to the OWLIM-Enterprise cluster master node as needing to be routed to all worker nodes. |
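
As a sketch of how the special graph names in the table above might be used, the first query below should return a single row in which ?person holds the total number of solutions rather than a resource. The foaf:Person and foaf:knows patterns are only illustrative assumptions about the data.

{noformat}
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person
FROM <http://www.ontotext.com/count>
WHERE {
   ?person a foaf:Person .
}
{noformat}

Similarly, assuming the behaviour described for the distinct graph name, a DESCRIBE query over several related resources should return each triple only once:

{noformat}
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

DESCRIBE ?person ?friend
FROM <http://www.ontotext.com/distinct>
WHERE {
   ?person foaf:knows ?friend .
}
{noformat}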