Date: November 19, 2012
Subject: Announcing OWLIM 5.3 - new workbench, improved monitoring and control, nested repositories, online backup and more
Ontotext are pleased to announce the release of OWLIM version 5.3, which includes Sesame 2.6.10 and a variety of new features and updates. This version is fully compatible with any database created with OWLIM 5.0 or later.
Users of OWLIM-SE now have a new deployment option in the form of a standalone, ready-to-run OWLIM-Workbench, which combines Sesame and OWLIM with a Jetty server and includes an administration interface built using the Forest framework. This framework was developed internally at Ontotext and is used to build services such as FactForge and LinkedLifeData.
The OWLIM-Workbench provides a superset of the features currently provided by the bundled Sesame Workbench, including:
- Security: easier-to-configure HTTP authentication, administration of user accounts and repository-level access control
- Repository management and configuration editing
- Data loading functionality
- Query execution with SPARQL syntax highlighting and results inspection
- Data exploration and export
- System information reporting
'Nested repositories' is an experimental OWLIM-SE feature for 'stacking' or 'sharing' repositories. It suits situations in which a large collection of reference data (perhaps a collection of linked open datasets) is shared by multiple OWLIM instances. Rather than storing the large reference data in each specialised repository, the data can simply be referenced. It is then used for query answering and for computing inferences together with the local data in the specialised repository.
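The idea can be pictured with a small toy model. The sketch below is only an illustration of the concept, assuming triples held in Python sets; the class and method names are hypothetical and are not part of OWLIM's or Sesame's API.

```python
# Toy model only: class and method names are hypothetical, not OWLIM's API.
from itertools import chain


class Repository:
    def __init__(self, triples=(), nested=()):
        self.triples = set(triples)  # statements stored locally
        self.nested = list(nested)   # referenced repositories, never copied

    def statements(self):
        """Yield local statements, then those visible via nested repositories."""
        seen = set()
        sources = chain([self.triples], (r.statements() for r in self.nested))
        for source in sources:
            for triple in source:
                if triple not in seen:
                    seen.add(triple)
                    yield triple

    def match(self, s=None, p=None, o=None):
        """Return statements matching a pattern; None acts as a wildcard."""
        return [t for t in self.statements()
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]


# A shared reference repository and a specialised one that nests it.
reference = Repository({("ex:Rembrandt", "rdf:type", "ex:Painter")})
project = Repository({("ex:NightWatch", "ex:creator", "ex:Rembrandt")},
                     nested=[reference])

# The query runs against the project but is answered from the reference data.
painters = project.match(p="rdf:type", o="ex:Painter")
```

The specialised repository still holds only its own statement, while queries against it see both its local data and the referenced data.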
Monitoring and control functions have been expanded with the ability to terminate a runaway update transaction. SPARQL allows the construction of updates that can cause arbitrarily large numbers of new statements to be inserted. In such a case, it is possible to instruct OWLIM to abort a long-running update and roll back to the previous state. Better query logging is also provided for examining execution plans and on-the-fly optimisations.
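As a rough illustration of how a short update can grow out of hand, consider an update whose WHERE clause pairs every matching resource with every other one. The update text, the ex: prefix and the property names below are invented for this example; the Python code only counts the variable bindings such a pattern would produce.

```python
from itertools import product

# Hypothetical SPARQL update: the WHERE clause is a cross product
# over all instances of ex:Thing, so bindings grow quadratically.
RUNAWAY_UPDATE = """
PREFIX ex: <http://example.org/>
INSERT { ?a ex:relatedTo ?b }
WHERE  { ?a a ex:Thing . ?b a ex:Thing }
"""


def inserted_statement_count(things):
    """Count the (a, b) bindings the WHERE clause above would produce."""
    return sum(1 for _ in product(things, repeat=2))


# 1,000 matching resources lead to up to 1,000,000 new statements.
count = inserted_statement_count(range(1000))
```

With a few million matching resources, the same two-line update would attempt to insert trillions of statements, which is the kind of transaction the new abort function is meant for.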
The functionality of OWLIM-Enterprise has been expanded in two ways. Firstly, convenient backup and restore methods have been added to the JMX interface. These use the replication function to copy a worker node database image to a directory on the cluster master node when making a backup, and to copy an image from the master node to all attached workers when restoring from a backup. Secondly, the 'remote replication' feature allows two clusters to remain in sync. This can be useful when maintaining a disaster recovery cluster instance, because it allows the master in the remote cluster to appear as a normal worker node and receive updates from the master in the main cluster.
Other improvements include:
- A performance degradation when loading very large datasets has been fixed. After a few billion statements, load performance started to drop, and by around 6 billion statements it was four times slower than it should be. After the fix, data loading at 20 billion statements is around 50% slower than loading into an empty database.
- It is now possible to put a global limit on the number of query results per query. Any queries that generate more results will have the remainder truncated. This feature can be useful for any public-facing SPARQL endpoints.
- The Jena adapter layer for OWLIM-SE has been updated to version 2.7.3. This is also reflected in the TopBraid Composer plug-in.
- Consistency checks in OWLIM-SE and OWLIM-Enterprise are now strictly enforced. If consistency checking is enabled, data that causes an inconsistency will not be allowed, and an update transaction containing an inconsistency will abort and roll back to the previous database state. For example, when using the OWL2-RL ruleset, an attempt to declare an individual as being a member of two disjoint classes will trigger a rollback.
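The strict consistency behaviour in the last item can be sketched as an all-or-nothing commit. This is a toy model under stated assumptions, not OWLIM's implementation: the disjointness axiom, the class names and the function names are all hypothetical.

```python
# Toy model only: names are hypothetical, not OWLIM's implementation.
DISJOINT = {frozenset({"ex:Painting", "ex:Sculpture"})}  # assumed axiom


class InconsistencyError(Exception):
    pass


def is_consistent(types):
    """types maps each individual to the set of classes it belongs to."""
    return not any(pair <= classes
                   for classes in types.values()
                   for pair in DISJOINT)


def commit(store, additions):
    """Apply (individual, class) additions atomically, or not at all."""
    candidate = {ind: set(classes) for ind, classes in store.items()}
    for individual, cls in additions:
        candidate.setdefault(individual, set()).add(cls)
    if not is_consistent(candidate):
        # Nothing has been written to store yet, so this acts as a rollback.
        raise InconsistencyError("disjoint classes share a member")
    store.clear()
    store.update(candidate)


store = {"ex:obj1": {"ex:Painting"}}
try:
    # The second addition makes ex:obj1 a member of two disjoint classes.
    commit(store, [("ex:obj2", "ex:Painting"), ("ex:obj1", "ex:Sculpture")])
    rolled_back = False
except InconsistencyError:
    rolled_back = True
```

Note that the valid addition for ex:obj2 is discarded along with the offending one, since the whole transaction is rejected.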
For a full list of updates, see the release notes for the relevant edition of OWLIM.
The “Nested Repositories” feature is a perfect match for the RS requirement of one shared dataset plus per-project datasets.
- It allows something better: sharing datasets in different per-project configurations, e.g.:
  Project Rembrandt uses data from RKD, BM, NGA, and its own project data
  Project Cranach uses data from Gemaldegalerie, NGA, and its own project data
- It doesn’t use Named Graphs, which allows us to use Named Graphs for something else.
(E.g. Josh’s idea to use them per-object, to enable easier updating)
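The per-project configurations listed above could be written down as a simple mapping. This is a hypothetical sketch only: the dataset and project names come from the notes, but the data structure and function names are assumptions and do not reflect how OWLIM repositories are actually configured.

```python
# Hypothetical configuration sketch; not an OWLIM configuration format.
PROJECT_DATASETS = {
    "Rembrandt": ["RKD", "BM", "NGA"],
    "Cranach": ["Gemaldegalerie", "NGA"],
}


def visible_datasets(project):
    """Shared datasets nested into a project, plus the project's own data."""
    return PROJECT_DATASETS[project] + [f"{project} (own data)"]


def projects_sharing(dataset):
    """Projects that reference a given shared dataset."""
    return sorted(p for p, shared in PROJECT_DATASETS.items()
                  if dataset in shared)
```

For instance, `projects_sharing("NGA")` lists both projects, which is the case where storing the NGA data once and referencing it twice pays off.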
RS already uses OWLIM 5.3, with some patches by Mitac to enable “Focused FTS indexing”.