OWLIM-Enterprise Release Notes

This documentation is NOT for the latest version of GraphDB.

Latest version - GraphDB 7.1

New features and significant bug-fixes/updates for the last few versions are recorded here.

Version 5.2 (build 5563)

  • Fix to prevent the query optimiser choosing a sub-optimal query plan after a long sequence of insert and delete modifications. Fragmentation of storage pages was causing errors in the complexity computations.
  • Fix to prevent concurrent modification exceptions when namespaces are being updated.

Version 5.2 (build 5512)

  • Fix to prevent a memory leak due to connection references kept by the PluginManager. The leak could also cause performance degradation over time.
  • Fix to dataset management that was causing explicit triples from the default (nameless) graph to be included as input to query execution when the query uses FROM or FROM NAMED and the includeInferred parameter is set to false.

Version 5.2 (build 5497)

  • Fix to prevent org.apache.lucene.store.LockObtainFailedException when incrementally updating a Lucene index. The index's configuration was being incorrectly serialised causing unpredictable behaviour.
  • Fix for missing query results when the optional predicate-lists index is switched on. With this index enabled, statements with certain predicates were being ignored.

Version 5.2 (build 5479)

  • Fix for an out of memory error that can be caused when using the query-timeout parameter.

Version 5.2 (build 5421)

  • Update to fire a JMX notification when a worker node is low on disk space

Version 5.2 (build 5331)

  • Fix for the known problem that prevented custom rule files from being compiled when using Java 1.7
  • Fix to avoid the stack overflow problem when optimising certain SPARQL queries that use the MINUS operator.

Version 5.2 (build 5316)

This is a maintenance release that includes Sesame 2.6.8 (see the Sesame change logs for 2.6.7 and 2.6.8). Note that 2.6.7 is NOT backward compatible with 2.6.6 due to a couple of minor changes to interfaces. The following significant updates have been made:

  • A number of resilience improvements to cluster management
    • Better handling of out of disk space problems
    • Better communication between workers in all modes of operation
    • New JMX operation to cancel replication
  • Support for the N-Quads RDF format
  • Changes to the Plug-in SDK
    • Add transaction begin/end information to Statements.Listener interface
    • Allow for pre-processor plug-ins to modify the query inside their request
    • StatementIterator has new methods for testing read-only, explicit and implicit status
  • Improvements to the getting-started application to allow it to load very large RDF files without the need to break them into smaller pieces
  • Improved locking (less contention) in the entity pool and when using RDF Rank
  • Cache/index statistics are now always collected over JMX (the attribute to switch this on/off has been removed)

Known problems:

  • Custom rule-sets will not compile when using Java 1.7 - this will be fixed in the near future with an interim update

The full set of updates for this release includes:

  • Bug
    • OWLIM-359 - Support different file formats in "imports" parameter
    • OWLIM-495 - Blank node contexts ignored by getStatements()
    • OWLIM-696 - Context parameter ignored when reading statements using HTTP protocol
    • OWLIM-767 - Improve thread synchronisation for RDF rank plug-in
    • OWLIM-776 - Cluster failure due to replicating to a worker that is out of disk space
    • OWLIM-782 - INSERT update hangs and consumes huge amount of memory
    • OWLIM-784 - Poor query performance due to query optimisation problems when using Sesame's QueryJoinOptimizer
    • OWLIM-804 - Owlim SE runs a query with an optional and a property path 10 times slower than Owlim Lite.
    • OWLIM-807 - Using Predicate Lists has an adverse effect on query performance - incorrect complexity estimate
    • OWLIM-813 - Slow deletion of statements using DELETE WHERE
    • OWLIM-815 - Lucene FTS functional tests failing when executed against remote repository
    • OWLIM-825 - Query timeout does not apply on certain queries
  • Improvement
    • OWLIM-697 - Add support for the NQuads RDF format
    • OWLIM-590 - Improve efficiency of RDFRank recomputation
    • OWLIM-706 - Collect and export statistics over JMX unconditionally
    • OWLIM-777 - Worker nodes should respond appropriately on system status check in any state
  • New Feature
    • OWLIM-582 - Allow for Preprocessor plug-ins to modify the query inside their request
    • OWLIM-774 - Create a read-only system named graph that will be used in the UniversalConverter to separate the schema statements from the other explicit statements.
  • Task
    • OWLIM-496 - Axiomatic statements should behave as inferred statements during query answering
    • OWLIM-788 - Add transaction information to plug-in SDK callback interface
    • OWLIM-796 - Add more methods to plug-in SDK StatementIterator to test for explicit and implicit attributes
    • OWLIM-816 - Share BufferPool instances that manage ByteBuffers of the same size
    • OWLIM-830 - Improve loading behaviour of getting started to handle huge RDF files

Version 5.1 (build 5208)

  • Fix the known problem in build 5183. An incompatibility between OWLIM and Sesame query optimisation (QueryJoinOptimizer) causes poor performance in certain circumstances. The use of QueryJoinOptimizer has been restricted to sub-select optimisation only.

Version 5.1 (build 5183)

This is a maintenance release that includes Sesame 2.6.6 and many fixes from interim releases made since 5.0. Repositories created with version 5.0 are binary compatible with 5.1, i.e. the OWLIM software can be updated and used with existing storage files created with version 5.0.

The following improvements have been made:

  • Axiomatic statements now treated as inferred statements during query answering
  • License file can now be set using an environment variable: OWLIM_LICENSE_FILE
  • Username and password parameters added to GettingStarted when using remote repositories with HTTP authentication
  • This release includes Sesame 2.6.6 - please see the change log for this release.

The following bugs have been addressed:

  • Full replication fails when using different platform specific local pathnames.
  • Read-only (imported) statements lose their read-only status when migrating from previous versions
  • Increased memory demand in OWLIM due to delayed finalization
  • Plugin API's statement modification methods (Statements.put/delete) don't allow modifications
  • Fixed an integer overflow bug in the compression module that resulted in an exception whenever the overlay file grew beyond 2 GB.
  • Fixed a bug that could cause a large increase in query time when using plugins. It occurred when a plugin triple pattern was combined with an ordinary triple pattern that has a very large collection size and is placed before the plugin triple pattern.
  • Fixed a bug that can cause incorrect query optimisation and/or a NullPointerException at query time. This problem can occur when estimating the number of matching triples for patterns containing a predicate for which there are no asserted statements in the repository.
  • Workaround to avoid an unnecessary BottomUpJoinIteration (sub-select intersection) when a sub-select is joined with an ordinary statement pattern or a join of such patterns.
  • System statements are now filtered out from getContextIDs()
  • Resolved memory leaks when updates are mixed with queries involving unbound predicate variables. These leaks caused all unused indexes to be kept locked.
  • External bindings are now applied before query optimisation. This speeds up such queries by avoiding filters in 'AfterOptionals'.
  • Collected Namespaces were not properly persisted on Windows
  • Added transactional handling of changes in properties file - fingerprints, namespaces, geometry, etc.
  • Rolled back transactions do not close transaction log files in a timely manner, leading to "too many open files" error when many rollbacks occur in sequence. Clean-up code has been relocated to ensure that it is called immediately.
  • Equivalence class updates do not close temporary files in a timely manner, leading to "too many open files" error when many transactions containing owl:sameAs statements are committed in sequence. Temporary files now removed immediately after the transaction completes.
  • Rebuild of predicate lists fails on Windows - the old files were locked and not deleted.
  • Repository lockfile is not released after failed initialisation. The new behaviour is to remove the lockfile if initialisation has failed and the lock file did not exist at the start of initialisation.
  • Schema updates should not allow removal of inferred statements.
  • Suboptimal query plan when using geospatial index
  • System contexts (from ternary relations in rule files) are visible to the Sesame workbench when browsing contexts
  • Namespaces lost when instance terminated

Known problems

The improvements in both Sesame and OWLIM for better optimisation of sub-queries have unfortunately caused a regression in query performance when using property paths with optional elements. This issue is being urgently addressed and a fix will be available very soon via an interim 5.1 release (later build number).

The problem affects queries of this form, e.g.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT * WHERE {
  ex:PersonA foaf:knows/foaf:knows?/foaf:name ?name .
}

If you experience this problem, it might be possible to re-write such queries using a UNION, e.g.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT * WHERE {
  { ex:PersonA foaf:knows/foaf:name ?name }
  UNION
  { ex:PersonA foaf:knows/foaf:knows/foaf:name ?name }
  # with more UNIONs for longer paths as necessary
}
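
For longer paths, the UNION branches can be generated mechanically. The following is a minimal Python sketch, not part of OWLIM or Sesame: the function names `expand_optional_steps` and `union_rewrite` are hypothetical, and it only handles simple sequence paths whose steps are either mandatory or marked optional with `?`. PREFIX declarations still have to be prepended by the caller.

```python
from itertools import product


def expand_optional_steps(steps):
    """Expand a sequence path with optional ('?') steps into plain sequences.

    `steps` is a list of (predicate, is_optional) pairs, for example
    [("foaf:knows", False), ("foaf:knows", True), ("foaf:name", False)].
    """
    # A mandatory step has one alternative (present); an optional step has
    # two alternatives (present or absent).
    alternatives = [
        ((pred,), ()) if optional else ((pred,),)
        for pred, optional in steps
    ]
    paths = []
    for combo in product(*alternatives):
        path = "/".join(pred for part in combo for pred in part)
        # The empty path (all steps optional and absent) is not handled by
        # this sketch, so it is skipped along with duplicates.
        if path and path not in paths:
            paths.append(path)
    return paths


def union_rewrite(subject, steps, obj):
    """Build a SELECT query with one UNION branch per expanded path."""
    branches = [
        "  { %s %s %s }" % (subject, path, obj)
        for path in expand_optional_steps(steps)
    ]
    return "SELECT * WHERE {\n" + "\n  UNION\n".join(branches) + "\n}"


if __name__ == "__main__":
    steps = [("foaf:knows", False), ("foaf:knows", True), ("foaf:name", False)]
    print(union_rewrite("ex:PersonA", steps, "?name"))
```

Each optional step doubles the number of branches, so this rewrite is only practical for paths with a small number of optional steps.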

Version 5.0

  • This version of OWLIM-Enterprise is not backward compatible with any previous version. This means that images created with OWLIM 4.3 and before will not work correctly with OWLIM 5.0 and must be re-created. There have been a great many modifications to the storage files, indexing structures, etc., and upgrade mechanisms have proven too complex and probably slower than re-loading the database anyway. Please do not attempt to upgrade to OWLIM 5.0 unless you drop and recreate all databases.
  • Transaction management and isolation mechanisms have been completely refactored. The previous strategy used very lazy writing of modified database pages, such that dirty pages were only flushed to disk when further updates occur and no more memory is available. While extremely fast, the problem with this approach is that there is a considerable recovery time associated with replaying the transaction log after an abnormal termination. The new mechanism uses two modes: 'bulk-loading' (fast) with similar behaviour to previous versions and 'normal' (safe) where database modifications are flushed to disk as part of the commit operation. When running in safe mode, database recovery is instant and there is a significant improvement in concurrency between updates and queries. Some related changes are:
    • There is a new parameter to control the transaction 'mode' called transaction-mode - see the configuration section
    • The database-recovery-policy configuration parameter is no longer required and has been removed
    • The special flush predicate http://www.ontotext.com/flush used to force pages to be written to disk is no longer required and has been removed. Statements using this predicate will be treated like any other statement.
    • The special reinfer predicate http://www.ontotext.com/owlim/system#reinfer used to force a re-computation of all inferences has been removed. Statements using this predicate will be treated like any other statement.
    • In fast transaction mode, the isolation constraint can be relaxed in order to improve concurrency behaviour when strict read isolation is not a requirement - this is controlled by a new parameter transaction-isolation that only has an effect in fast mode, see the configuration section
    • No recovery mechanisms are in place when running in fast mode - therefore administrators must treat an abnormal termination during bulk-loading as a fatal event and must restart the loading procedure
  • New context indices can be used to improve query performance when data is modelled using many named graphs. These are switched on and off using a single configuration parameter enable-context-index - see the configuration section
  • The SPARQL 1.1 Graph Store HTTP Protocol is now supported according to the W3C Working Draft from the 12th May 2011. This provides a REST interface for managing collections of graphs, using either directly or indirectly named graphs.
  • Sesame 2.6.5 with many bug-fixes and updates to bring SPARQL 1.1 Query support up to the latest W3C Working Draft from the 5th January 2012.
  • Significant reduction in disk-space requirements is achieved with the following modifications:
    • Index compression can now be used to reduce disk storage requirements by using zip compression on database pages. This feature is off by default, but can be switched on when creating a new repository. The configuration parameter index-compression-ratio can be set to -1 (the default value indicating no compression) or a value in the range [10-50] indicating the desired percentage reduction in page sizes. Any pages that cannot be compressed by the specified amount are stored uncompressed. Therefore a compression ratio that is too aggressive will not bring many benefits. Experiments have shown that for large datasets a value of about 30% is close to optimal.
    • Restructuring of the triple indices has also led to a reduction in disk-space requirements of around 18% independent of the compression functionality
    • Entity compression is a modification that reduces the storage requirements for the lookup table that maps between internal identifiers and resources. This is transparent to the user and happens automatically. More disk space reductions are apparent using this version.
  • A new literal index is created automatically for numeric and date/time data-types. The index is used during query evaluation only if a query or a subquery (e.g. union) has a filter that is comprised of a conjunction of literal constraints, e.g. FILTER(?x >= 3 && ?y <= 5 && ?start > "2001-01-01"^^xsd:date). Other patterns, including those that use negation, will not use the index for this version of OWLIM.
  • All control queries now use SPARQL Update syntax (used mostly to control the Lucene-based full-text search, RDF Rank and geo-spatial plug-ins). This has a number of advantages, namely:
    • No special control query pseudo-graph is required by the Replication Cluster master in order to identify control queries that must be pushed to all worker nodes
    • SPARQL Updates use the corresponding SPARQL update protocol, so they can be automatically processed by load-balancers that examine URL patterns
    • It is more consistent with the SPARQL language, since these 'control queries' cause a change of state in OWLIM
  • Incremental Lucene-based full-text search index for updating the index for specific resources or all un-indexed resources. Using this technique can avoid the more expensive approach of rebuilding the whole index frequently.
  • Incremental RDF Rank allows the RDF rank for specific resources to be (re-)computed as directed by the user. This technique can avoid the more expensive approach of rebuilding all RDF Rank values frequently.
  • The Geo-spatial index has been updated to support 40-bit resource identifiers.
  • The getting started application has been restructured so that it now works with remote repositories.
  • OWLIM-Enterprise also includes the following maintenance updates and fixes:
    • Bugs
      • OWLIM-703 Getting started in OWLIM-Enterprise distribution zip has incorrect sail type in owlim.ttl
      • OWLIM-669 Sesame workbench no longer provides OWLIM options when creating a new repository
      • OWLIM-624 Entering the license file in Sesame Workbench requires backslashes to be escaped (twice)
    • New features
      • OWLIM-671 Port the custom "Update Timeout" functionality into 4.3 and 5.0 branches
      • OWLIM-636 Improved software license validation
    • Other
      • OWLIM-629 Make Getting Started more robust
      • OWLIM-522 Refactor LUBM test drivers
      • OWLIM-519 Drop Java 1.5 compatibility (change to Java 1.6)
      • OWLIM-498 Format and clean-up entire OWLIM code-base using eclipse formatter
      • OWLIM-422 Reformulate control queries to use SPARQL 1.1 Update syntax
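
The page-level rule described above for index-compression-ratio can be illustrated with a small Python sketch. This is not OWLIM's implementation: the function names, the 8192-byte page size and the use of `zlib` are assumptions made for illustration, and the sketch reads a ratio of 30 as "the compressed page must be at least 30% smaller than the original".

```python
import os
import zlib


def validate_ratio(ratio):
    """Accept -1 (no compression) or a percentage in the range 10 to 50."""
    if ratio == -1 or 10 <= ratio <= 50:
        return ratio
    raise ValueError("index-compression-ratio must be -1 or in the range 10-50")


def store_page(page, ratio):
    """Return (is_compressed, bytes_to_store) for one database page.

    Assumption: the page is stored compressed only if compression shrinks
    it by at least `ratio` percent; otherwise it is stored uncompressed.
    """
    ratio = validate_ratio(ratio)
    if ratio == -1:
        return False, page
    packed = zlib.compress(page)
    if len(packed) <= len(page) * (100 - ratio) / 100:
        return True, packed
    return False, page


if __name__ == "__main__":
    repetitive = b"a" * 8192        # hypothetical page that compresses well
    random_like = os.urandom(8192)  # hypothetical page that barely compresses
    for name, page in (("repetitive", repetitive), ("random", random_like)):
        compressed, stored = store_page(page, 30)
        print(name, compressed, len(stored))
```

Under this reading, raising the ratio makes fewer pages qualify for compression, which is consistent with the note that an overly aggressive value brings few benefits.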

Known problems with OWLIM 5.0 BETA 3

  • The behaviour of the 'include inferred' checkbox in the Sesame Workbench is unpredictable when using OWLIM repositories.

Version 4.3

Further contributions to the Sesame framework from Ontotext and Fluid Operations mean that Sesame version 2.6 is included with this version of OWLIM. The following new features are available:

  • SPARQL 1.1 Federation support that allows queries to pull together data from any number of distributed SPARQL endpoints
  • A new SPARQL repository type to wrap SPARQL endpoints
  • Improvements to the parser for controlling the level of literal/data-type validation and the handling of errors
  • Many other fixes for compliance with the latest revised SPARQL 1.1 working drafts

OWLIM now has a plug-in API that allows users to build software components that alter the behaviour of OWLIM. This mechanism can be used to add new features or to improve performance in certain scenarios.

OWLIM also includes the following maintenance updates and fixes:

  • OWLIM-205 - Validate literal languages and do not allow invalid language tags to enter the repository
  • OWLIM-273 - Potential thread leak in QueryModelConverter
  • OWLIM-390 - Counting statements using Sesame API gives strange results.
  • OWLIM-419 - Make RepositoryConnection.exportStatements obey the time limit
  • OWLIM-426 - Unable to permanently remove predefined namespace definitions
  • OWLIM-428 - Explicit axioms don't show up as explicit if they have been inferred before by other axioms
  • OWLIM-463 - Clear transaction log in replication cluster if it cannot be initialized
  • OWLIM-466 - SesameConnectionImpl.getStatements must return quads, not trips (breaks workbench explore)
  • OWLIM-470 - Query with Union and optional returns wrong results
  • OWLIM-471 - Can not access new repository when FTS switched on (divide by zero or lockfile locked)
  • OWLIM-473 - onto:explicit pseudo-graph does not prevent implicit statements as input for query answering
  • OWLIM-475 - Repackaged console.sh in openrdf-console.zip has lost its execute attribute
  • OWLIM-476 - Neither of the slf4j jars (api or jdk14) are needed in the war files
  • OWLIM-483 - Lost solutions to queries with FROM <...> clause
  • OWLIM-485 - Repository with many transactions fails to get restored
  • OWLIM-488 - Incorrect behaviour of FROM and FROM NAMED in SPARQL queries
  • OWLIM-489 - Predicate list indices do not log statistics
  • OWLIM-490 - User-supplied Dataset object on query not properly handled
  • OWLIM-491 - Query rewriting in MainQuery.convertToOptimizedForm() converts OR to AND in filters when converting the condition to disjunctive normal form
  • OWLIM-495 - Blank node contexts ignored by getStatements()
  • OWLIM-501 - Lucene and OPTIONAL query bug
  • OWLIM-502 - The database restorer deletes the pso and pos files after second unsuccessful restore
  • OWLIM-457 - Validate data-type values at load time
  • OWLIM-497 - Update getting-started and add timestamps
  • OWLIM-356 - Optimized rule set is not compatible with the rule compiler.
  • OWLIM-480 - Make use of the com.ontotext.trree.collections for the predicate map in order to reuse the file header and the common interface

Version 4.2

Ontotext have continued to invest in the Sesame project and are pleased to announce the inclusion of Sesame version 2.5 with this version of OWLIM. The benefits include:

  • SPARQL 1.1 Update - this extension of SPARQL provides a much more powerful method to modify RDF databases without the requirement for developers to use frameworks and APIs.
  • SPARQL 1.1 Query conformance has been updated to the May 2011 working draft, i.e. all the remaining behaviour has been implemented along with all the new SPARQL filter functions.
  • The SPARQL protocol has also been updated to January 2010 working draft.
  • A new binary RDF serialization format. This format has been derived from the existing binary tuple results format. Its main features are reduced parsing overhead and minimal memory requirements.

As well as integration with the new Sesame APIs and modifications for optimising SPARQL Update, there have also been a number of bug fixes in this version of OWLIM-Enterprise:

  • OWLIM-396 - A RuntimeException is thrown in clearNamespaces() in SailConnection
  • OWLIM-404 - HashEntityPool fails to store/read its entity index table if its size is more than ~500M
  • OWLIM-408 - Getting of default namespace doesn't work
  • OWLIM-440 - Can not create geo-spatial index when using OWLIM-SE with Tomcat
  • OWLIM-443 - Repository fails to start - entity pool error
  • OWLIM-445 - disable-sameAs causing query evaluation to lose bindings
  • OWLIM-446 - Query.setIncludeInferred() is ignored
  • OWLIM-447 - License file can not be specified - default evaluation license is always used.
  • OWLIM-449 - Wrong conversion from int to long in com.ontotext.trree.plugin.lucene.LuceneIterator
  • OWLIM-452 - Multiple wrong results are returned for a CONSTRUCT query
  • OWLIM-454 - EntityStorageVersion3 fails to restore if a long entity has negative size.
  • OWLIM-455 - Cannot put any more statements in AVL tree after ~3.1B statements added during 3.5-to-4.0 conversion
  • OWLIM-305 - Rationalise OWLIM vocabulary

Version 4.1

This maintenance release includes Sesame 2.4.2, which fixes several important bugs in SPARQL 1.1 Query support.

Also included are some updates to OWLIM-SE:

  • Unexpected binding returned in a SPARQL query with union within an optional expression
  • FILTER in OPTIONAL patterns returns incorrect results
  • Aggregate SPARQL query fails with IndexOutOfBoundsException
  • Default and named graphs set in a SPARQL query are ignored by the Jena connector

Version 4.0

  • OWLIM Replication Cluster has been renamed to OWLIM-Enterprise and is distributed separately from OWLIM-SE. This new name better identifies this software component as the flagship product of the OWLIM family suitable for mission critical applications.
  • Easy to deploy WAR files: The distribution now includes openrdf-sesame and openrdf-workbench Web applications pre-configured with OWLIM and ready to deploy. This makes installing OWLIM as a server and creating/administrating OWLIM repositories trivially simple. The WAR files can be found in the sesame_owlim directory of the distribution ZIP file. See 'easy install' in the installation section.
  • SPARQL 1.1 Query: Ontotext has invested significant development resources in the Sesame project in order to bring SPARQL 1.1 support to all editions of OWLIM. Since OWLIM-Enterprise is a distributed architecture based on OWLIM-SE, OWLIM-Enterprise also includes SPARQL 1.1 Query, but without federation support for the moment. SPARQL 1.1 Update support will be included in the next release. The new features include:
    • Aggregates
    • Subqueries
    • Negation
    • Expressions in the SELECT clause
    • Property Paths
    • Assignment
    • A short form for CONSTRUCT
    • An expanded set of functions and operators
  • The SPARQL 1.1 specification has not yet become a W3C recommendation and continues to evolve. The following known issues apply to this release of OWLIM and Sesame:
    • fn:concat is not supported. This was added to the working draft in May, just after the Sesame 2.4.0 release was finalised. It will likely be included in the next Sesame/OWLIM release.
    • Empty IN() and NOT IN() clauses will cause an exception - will be fixed in the next release.
    • Using the aggregate function SUM() will cause an exception if there are no bindings over which to do the summation - will be fixed in the next release.
    • Federation is not yet supported. This will be implemented in a version of Sesame and OWLIM due later this year.
    • There are some problems with complex expressions in the SELECT clause. This should be fixed in the next release of Sesame/OWLIM.

Version 3.5

This release includes many bug fixes, several new features and updates:

  • Write-only worker node: When worker nodes are added to the cluster via the JMX interface, they can be specified as being 'write-only'. These nodes will be kept in synch with the rest of the cluster, but will not take part in answering cluster queries. The motivation for this feature is to have one or more worker nodes available for batch processing of queries that do not affect the overall query performance of the cluster.
  • Remote notifications: A new mechanism to complement the existing high-performance 'in-process' notification mechanism. This new mechanism allows clients to subscribe to OWLIM Replication Cluster master nodes for notifications about given statement patterns.
  • Online documentation: As well as the PDF format user guides included in the OWLIM distribution zip files, the latest documentation for all editions of OWLIM is now available online.
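
The effect of a write-only worker node can be pictured with a toy model. The Python sketch below is purely conceptual and does not reflect the real master node or its JMX interface: the `Worker` and `Master` classes, the round-robin query dispatch and the in-memory statement sets are all assumptions made for illustration.

```python
from itertools import cycle


class Worker:
    """Toy worker node holding a set of statements."""

    def __init__(self, name, write_only=False):
        self.name = name
        self.write_only = write_only
        self.statements = set()


class Master:
    """Toy master: updates go to every worker, queries only to readers."""

    def __init__(self, workers):
        self.workers = list(workers)
        readers = [w for w in self.workers if not w.write_only]
        if not readers:
            raise ValueError("at least one worker must be able to answer queries")
        # Round-robin dispatch is an assumption of this sketch.
        self._readers = cycle(readers)

    def update(self, statement):
        # Write-only workers are kept in sync with the rest of the cluster.
        for worker in self.workers:
            worker.statements.add(statement)

    def query(self, statement):
        # Write-only workers never take part in answering cluster queries.
        worker = next(self._readers)
        return worker.name, statement in worker.statements


if __name__ == "__main__":
    batch = Worker("batch", write_only=True)
    master = Master([Worker("w1"), Worker("w2"), batch])
    master.update(("ex:s", "ex:p", "ex:o"))
    print([master.query(("ex:s", "ex:p", "ex:o")) for _ in range(4)])
    print(("ex:s", "ex:p", "ex:o") in batch.statements)
```

In this model the write-only worker holds the same data as the others and can be queried directly for batch processing without adding load to the nodes that serve cluster queries.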

Version 3.4

  • Replication cluster introduced in this version of OWLIM: brings resilience, failover and horizontally scalable parallel query processing. A master node component is included that can manage a cluster of worker nodes (standard BigOWLIM instances) to synchronise updates, cater for node failure, dynamically add/remove worker nodes and distribute query requests. Such a setup allows for massive concurrent query performance where the number of queries processed per second scales almost linearly with the number of worker nodes.