The major differences between OWLIM-SE and OWLIM-Lite are their performance and scalability. Both OWLIM editions deliver identical functionality for RDF storage, inference and query answering and they both implement Sesame's SAIL APIs, as discussed in section 4. This guarantees that all essential functions of a semantic repository are supported by OWLIM in a standard, consistent, and interoperable manner.
In the 'do more' category, OWLIM-SE delivers functionality that is not exposed by the Sesame API. Typically, this is achieved with the use of special-purpose system predicates. One should be aware that using the 'do more' features will affect compatibility with other semantic repositories.
OWLIM-SE stores all of its data (statements, indexes, entity pool, etc.) in files in the configured storage directory, usually called 'storage'. The content and names of these files are not defined and are subject to change between versions. In general, the index structures used in OWLIM-SE are chosen and optimised to allow for efficient:
OWLIM-SE maintains two main indices on statements for use in inference and query evaluation: the predicate-object-subject (POS) index and the predicate-subject-object (PSO) index. Many additional data structures are used to enable the efficient manipulation of RDF data, but they are not listed here because these internal mechanisms cannot be configured.
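The role of the two orderings can be illustrated with a toy in-memory model. This is a minimal sketch only and says nothing about OWLIM-SE's actual file formats or data structures; the class name ToyStatementIndexes, its methods, and the ex: identifiers are hypothetical. The POS ordering answers patterns where the predicate and object are known, and the PSO ordering answers patterns where the predicate and subject are known.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only: NOT OWLIM-SE's real index implementation.
final class ToyStatementIndexes {
    // POS ordering: predicate -> object -> subjects
    private final Map<String, Map<String, Set<String>>> pos = new TreeMap<>();
    // PSO ordering: predicate -> subject -> objects
    private final Map<String, Map<String, Set<String>>> pso = new TreeMap<>();

    // Every statement is written to both indices.
    void add(String s, String p, String o) {
        pos.computeIfAbsent(p, k -> new TreeMap<>())
           .computeIfAbsent(o, k -> new TreeSet<>()).add(s);
        pso.computeIfAbsent(p, k -> new TreeMap<>())
           .computeIfAbsent(s, k -> new TreeSet<>()).add(o);
    }

    // Pattern (?s, p, o): answered from the POS index.
    Set<String> subjects(String p, String o) {
        return pos.getOrDefault(p, Collections.emptyMap())
                  .getOrDefault(o, Collections.emptySet());
    }

    // Pattern (s, p, ?o): answered from the PSO index.
    Set<String> objects(String s, String p) {
        return pso.getOrDefault(p, Collections.emptyMap())
                  .getOrDefault(s, Collections.emptySet());
    }
}
```

Note that both lookups start from the predicate, which is why a pattern with an unbound predicate cannot be answered directly from either ordering in this toy model.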
Certain datasets and certain kinds of query activity, for example queries that use wild-card patterns for predicates, benefit from another type of index called a 'predicate list'. This index maps from entities (subject or object) to their predicates. It is not switched on by default (see enablePredicateList in section 8.5), because it is not always necessary. Indeed, for most datasets and query loads the performance of OWLIM-SE without such an index is good enough even with wild-card-predicate queries, and the overhead of maintaining this index is not justified. One should consider using this index for datasets that contain a very large number (~1000) of different predicates.
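The idea of a predicate list can be sketched as a simple map from each entity to the predicates it occurs with. This is an illustrative model under assumed names (ToyPredicateList, record, predicatesFor), not the structure OWLIM-SE builds when enablePredicateList is set. With such a map, a wild-card-predicate query for a given entity only needs to visit the predicates actually used with that entity, rather than every predicate in the dataset.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: NOT OWLIM-SE's real predicate-list index.
final class ToyPredicateList {
    // entity (seen as subject or object) -> predicates it appears with
    private final Map<String, Set<String>> predicatesOf = new HashMap<>();

    // Called once per stored statement; this is the maintenance overhead.
    void record(String s, String p, String o) {
        predicatesOf.computeIfAbsent(s, k -> new TreeSet<>()).add(p);
        predicatesOf.computeIfAbsent(o, k -> new TreeSet<>()).add(p);
    }

    // Predicates worth probing for a pattern such as (entity, ?p, ?o).
    Set<String> predicatesFor(String entity) {
        return predicatesOf.getOrDefault(entity, Collections.emptySet());
    }
}
```

The extra write on every statement is the cost side of the trade-off: it pays off only when the number of distinct predicates is large enough that probing each one becomes expensive.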
There are two more optional indices that can be used to speed up query evaluation when searching statements via their context or tripleset identifier. These are the PCSOT and PTSOC indices, and they can be switched on independently. See the build-pcsot and build-ptsoc parameters in section 8.5.
Transaction support is exposed via Sesame's RepositoryConnection interface. The three methods of this interface that give the client control over when updates are committed to the repository are shown below:
OWLIM-SE supports the so-called 'read committed' transaction isolation level, well known from relational database management systems. It guarantees that changes will not affect query evaluation before the entire transaction they are part of has been successfully committed. It does not guarantee that a single transaction is executed against a single state of the data in the repository. Regarding concurrency:
One should note that OWLIM performs materialization, making sure that all the statements which can be inferred from the current state of the repository are indexed and persisted (except for those compressed due to the owl:sameAs optimisation, described in section 7.5). When the commit method completes, all reasoning activities related to the changes in the data introduced by the corresponding transaction will have already been performed.
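The visibility rule of the 'read committed' level can be illustrated with a small stand-alone model. This is a sketch under simplifying assumptions (a single writer, statements represented as plain strings, no inference) and uses hypothetical names; it does not call the Sesame API and does not reflect how OWLIM-SE implements transactions. Pending changes are kept apart from the committed state, and queries only ever read the committed state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: a single-writer model of read-committed visibility.
final class ToyReadCommittedStore {
    private final Set<String> committed = new TreeSet<>();
    private final List<String> pending = new ArrayList<>();

    // Changes are buffered and invisible to queries until commit.
    void add(String statement) {
        pending.add(statement);
    }

    // All buffered changes become visible together.
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    // Buffered changes are discarded; the committed state is untouched.
    void rollback() {
        pending.clear();
    }

    // Queries see only committed data (a snapshot copy is returned).
    Set<String> query() {
        return new TreeSet<>(committed);
    }
}
```

In this model a call to query() between add() and commit() does not return the new statement, while two query() calls inside one long-running transaction may return different results if another commit happens in between, which is the behaviour that 'read committed' permits.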
As already described, OWLIM-SE applies the inference rules at load time in order to compute the full closure. A repository will therefore contain some statements that are explicitly asserted and others that exist through implication. In most cases clients will not be concerned with the difference; however, there are some scenarios in which it is useful to work with only explicit or only implicit statements. The following sections describe how these two groups of statements can be isolated during programmatic statement retrieval using the Sesame API and during (SPARQL) query evaluation.
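How load-time inference produces two groups of statements can be sketched with a toy forward-chaining loop. The example below assumes a single hard-coded rule (if x has type C and C is a subclass of D, then x has type D) and hypothetical names throughout; OWLIM-SE's rule engine and rule sets are considerably richer, so this is an illustration of the principle only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: one rule, naive fixpoint, NOT OWLIM-SE's reasoner.
final class ToyMaterializer {
    // Each statement is a 3-element list: subject, predicate, object.
    final Set<List<String>> explicit = new LinkedHashSet<>();
    final Set<List<String>> implicit = new LinkedHashSet<>();

    void addExplicit(String s, String p, String o) {
        explicit.add(Arrays.asList(s, p, o));
    }

    // Repeat the rule until a full pass adds nothing new (the closure).
    void materialize() {
        boolean changed = true;
        while (changed) {
            changed = false;
            List<List<String>> all = new ArrayList<>(explicit);
            all.addAll(implicit);
            for (List<String> a : all) {
                if (!a.get(1).equals("rdf:type")) continue;
                for (List<String> b : all) {
                    boolean applies = b.get(1).equals("rdfs:subClassOf")
                            && b.get(0).equals(a.get(2));
                    if (!applies) continue;
                    List<String> inferred =
                            Arrays.asList(a.get(0), "rdf:type", b.get(2));
                    // Keep a statement implicit only if it was not asserted.
                    if (!explicit.contains(inferred) && implicit.add(inferred)) {
                        changed = true;
                    }
                }
            }
        }
    }
}
```

After materialize() returns, the explicit set still holds exactly what was asserted and the implicit set holds only derived statements, which is the distinction the retrieval mechanisms in the following sections rely on.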
The usual technique for retrieving statements is to use the RepositoryConnection method:
The method retrieves statements by 'triple pattern', where any or all of the subject, predicate and object parameters can be null to indicate 'wild cards'.
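The wild-card semantics of triple-pattern retrieval can be shown with a self-contained model. The sketch below does not use the Sesame classes; ToyPatternStore and its string-based statements are hypothetical, and the includeInferred flag is an assumption modelled on the boolean parameter of the same name that Sesame 2.x's RepositoryConnection.getStatements accepts. A null argument matches any value in that position.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: mimics null-as-wild-card pattern matching.
final class ToyPatternStore {
    static final class Stmt {
        final String s, p, o;
        final boolean inferred; // true for implicit statements
        Stmt(String s, String p, String o, boolean inferred) {
            this.s = s; this.p = p; this.o = o; this.inferred = inferred;
        }
    }

    private final List<Stmt> statements = new ArrayList<>();

    void add(String s, String p, String o, boolean inferred) {
        statements.add(new Stmt(s, p, o, inferred));
    }

    // A null pattern component is a wild card.
    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }

    // includeInferred == false restricts the result to explicit statements.
    List<Stmt> getStatements(String s, String p, String o, boolean includeInferred) {
        List<Stmt> out = new ArrayList<>();
        for (Stmt st : statements) {
            if (st.inferred && !includeInferred) continue;
            if (matches(s, st.s) && matches(p, st.p) && matches(o, st.o)) {
                out.add(st);
            }
        }
        return out;
    }
}
```

For instance, getStatements(null, null, null, false) in this model returns every explicit statement, while passing true for the last argument also returns the implicit ones.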
OWLIM-SE also provides mechanisms to differentiate between explicit and implicit statements during query evaluation. This is achieved by associating statements with two pseudo-graphs (explicit and implicit) and using special system URIs to identify these graphs. Full details can be found in the advanced features section.