OWLIM-Enterprise Basics

This documentation is NOT for the latest version of GraphDB.

Latest version - GraphDB 7.1

An OWLIM-Enterprise cluster is organised as one or more master nodes that manage one or more worker nodes. Fail-over and load-balancing between worker nodes are automatic. Multiple (stand-by) master nodes keep the cluster available even if a master node fails. The cluster deployment can be modified while it is running, which allows worker nodes to be added during peak times or released for maintenance, backup, etc. Such a setup provides a high-performance, always-on service that can handle millions of query requests per hour.

The OWLIM-Enterprise cluster master manages and distributes atomic requests (query evaluations and update transactions) to a set of OWLIM instances.

Master Node

The Master node works in two modes: 'Normal' and 'Read-only'. Update requests are processed only when the Master node is in 'Normal' mode, which means that no issues have been detected with the registered nodes (such as not responding to status queries, HTTP errors or an invalid fingerprint).
Cluster configurations may include two Master nodes, in which case one of the Master nodes processes updates, while the other must be used only as a hot-spare in case the first Master node fails and does not respond to HTTP requests. If this happens, the cluster processes only read requests (e.g. query evaluations) via the hot-spare until the main Master is running again or the stand-by Master is switched into 'Normal' mode.
The Master node of an OWLIM-Enterprise cluster implements the Sesame Repository interfaces. It does not store any RDF data itself; its function is to route queries and update requests to a set of OWLIM-SE instances (nodes).
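
The two modes of operation described above can be illustrated with a minimal Python sketch. It is not OWLIM code: the names `Mode`, `master_mode` and `accepts`, and the idea of passing detected issues in as a dictionary, are assumptions made purely for illustration.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "Normal"
    READ_ONLY = "Read-only"


def master_mode(node_issues):
    """Return the mode implied by the issues detected on registered nodes.

    node_issues maps a node name to a list of detected issues, e.g.
    "no status response", "HTTP error" or "invalid fingerprint".
    (Hypothetical structure, used only for this sketch.)
    """
    if any(node_issues.values()):
        return Mode.READ_ONLY
    return Mode.NORMAL


def accepts(mode, request_kind):
    """Reads are always served; updates only in 'Normal' mode."""
    return request_kind == "read" or mode is Mode.NORMAL
```

For example, `master_mode({"w1": [], "w2": ["invalid fingerprint"]})` yields `Mode.READ_ONLY`, in which case `accepts(..., "update")` is false while read requests are still served.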

Worker nodes

Worker nodes are OWLIM-SE repositories, configured with identical rule-sets and hosted in the openrdf-sesame Web application running in a Java servlet container such as Tomcat. The Master node accesses them over HTTP through the SPARQL endpoint exported by the Sesame service.

Read requests (query evaluations)

Every read request (SPARQL query) is passed to one of the available worker nodes. The node is chosen based on runtime statistics: the number of queries currently running on that node and its average query evaluation time. The available nodes are organised in a priority queue, which is rearranged after every read operation. If an error occurs (time-out, lost connection, etc.), the request is resent to another available node, and so on until it is successfully processed.
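
The routing described above can be sketched roughly as follows. This is an illustration in Python, not the Master node's implementation: the `Worker` class, the `send` callback and the exact ordering (fewest running queries first, then lowest average evaluation time) are assumptions, as is raising an error once every node has failed.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    running: int = 0      # queries currently running on this node
    avg_ms: float = 0.0   # average query evaluation time so far
    completed: int = 0    # number of queries finished on this node

    def priority(self):
        # Assumed ordering: fewer running queries, then faster average.
        # The (unique) name breaks ties so tuples never compare Workers.
        return (self.running, self.avg_ms, self.name)


def route_read(workers, query, send):
    """Try workers in priority order until one evaluates the query.

    send(worker, query) is a hypothetical callback returning
    (result, elapsed_ms) or raising ConnectionError / TimeoutError.
    """
    queue = [(w.priority(), w) for w in workers]
    heapq.heapify(queue)
    while queue:
        _, worker = heapq.heappop(queue)
        worker.running += 1
        try:
            result, elapsed_ms = send(worker, query)
        except (ConnectionError, TimeoutError):
            continue  # resend the request to the next best node
        finally:
            worker.running -= 1
        # Update the running average; it feeds the next prioritisation.
        worker.completed += 1
        worker.avg_ms += (elapsed_ms - worker.avg_ms) / worker.completed
        return worker.name, result
    raise RuntimeError("no worker node could process the query")
```

Rebuilding the heap on every call stands in for "rearranged after every read operation"; a long-running master would more likely keep the queue and adjust it incrementally.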

Updates (add, remove and combined add/remove transactions)

Updates are handled in the following way. Each update is first logged. Then a probe node is selected and the request is sent to it. If the update succeeds, a set of control queries is evaluated against the probe node to check that the update is consistent with the data the node holds. If this control suite passes, the update is forwarded to the rest of the nodes in the cluster. Nodes that do not process the update successfully are disabled and scheduled for recovery replication.
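
As a rough illustration of this sequence, the following Python sketch walks through the same steps. All names (`handle_update`, `UpdateOutcome`, the `apply` and `control_suite_passes` callbacks) are hypothetical, the first node is simply taken as the probe, and rejecting the update when the probe or the control suite fails is an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateOutcome:
    accepted: bool
    applied_on: list = field(default_factory=list)
    needs_recovery: list = field(default_factory=list)  # disabled nodes


def handle_update(update, nodes, apply, control_suite_passes, log):
    """Log, probe, check, then forward an update.

    apply(node, update) -> bool and control_suite_passes(node) -> bool
    are hypothetical callbacks standing in for the HTTP calls.
    """
    log.append(update)                  # 1. log the update first
    probe, rest = nodes[0], nodes[1:]   # 2. select a probe node (assumed: first)
    # 3. apply on the probe, then run the control queries against it
    if not apply(probe, update) or not control_suite_passes(probe):
        return UpdateOutcome(accepted=False)  # assumption: update is rejected
    outcome = UpdateOutcome(accepted=True, applied_on=[probe])
    # 4. forward to the remaining nodes; failures go to recovery replication
    for node in rest:
        if apply(node, update):
            outcome.applied_on.append(node)
        else:
            outcome.needs_recovery.append(node)
    return outcome
```

Here `needs_recovery` only records which nodes would be disabled; the recovery replication itself is outside the scope of the sketch.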
The master node can be configured with a set of control SPARQL queries, which must be either CONSTRUCT or ASK queries. When a CONSTRUCT query is evaluated, the node is checked for the presence of the statements the query generates. If any of these statements are missing, a condition is raised and the control suite is considered to have failed. An example of a control query could be a check that property values respect the rdfs:range of an arbitrary property, e.g.

An example CONSTRUCT control query


The above query is somewhat redundant if the rule-set already includes such an inference rule, which is the case with the built-in rule-sets currently distributed with OWLIM (such as "rdfs", "owl-horst" and "owl-max").
ASK queries raise a condition if they return true (i.e. there is a sub-graph matching the statement patterns of the query). An example of such a control query is a check for common members of mutually disjoint classes, e.g.

An example ASK control query


The set of control queries is evaluated against the node on which the update was probed. If these queries pass on that node, the update is considered 'safe' and is forwarded to the remaining nodes in the cluster.
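
The pass/fail rules for the two kinds of control query can be illustrated with a small Python sketch over an in-memory set of triples. It does not run SPARQL: `range_construct` and `disjoint_ask` are hand-written stand-ins for queries of the kind described above (an rdfs:range check and a disjoint-class check), and the SPARQL shown in their comments is an assumed shape, not the queries shipped with OWLIM.

```python
def construct_check_passes(generated, store):
    """A CONSTRUCT control query passes only if every statement it
    generates is already present in the store."""
    return all(stmt in store for stmt in generated)


def ask_check_passes(ask_result):
    """An ASK control query raises a condition when it returns true."""
    return not ask_result


def range_construct(store):
    # Assumed query shape:
    #   CONSTRUCT { ?o rdf:type ?c } WHERE { ?s ?p ?o . ?p rdfs:range ?c }
    ranges = {(prop, cls) for prop, p, cls in store if p == "rdfs:range"}
    return {(o, "rdf:type", cls)
            for s, p, o in store
            for prop, cls in ranges if p == prop}


def disjoint_ask(store):
    # Assumed query shape:
    #   ASK { ?a owl:disjointWith ?b . ?x rdf:type ?a . ?x rdf:type ?b }
    types = {(s, o) for s, p, o in store if p == "rdf:type"}
    return any((x, b) in types
               for a, p, b in store if p == "owl:disjointWith"
               for x, cls in types if cls == a)


def control_suite_passes(store):
    """The suite passes only if no control query raises a condition."""
    return (construct_check_passes(range_construct(store), store)
            and ask_check_passes(disjoint_ask(store)))
```

With a store containing `("ex:knows", "rdfs:range", "ex:Person")` and `("ex:a", "ex:knows", "ex:b")`, the suite passes only if `("ex:b", "rdf:type", "ex:Person")` is also present, and it fails again as soon as `ex:b` is additionally typed with a class declared disjoint with `ex:Person`.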
The file with the control queries can be preset using the <http://www.ontotext.com/trree/owlim#queriesFile> configuration parameter of the master node. The file format is the same as the one used in the GettingStarted demo application included with the OWLIM distribution.
