OWLIM-Enterprise Release Notes

This documentation is NOT for the latest version of GraphDB.

Latest version - GraphDB 6.6

New features and significant bug-fixes/updates for the last few releases are recorded here. Full version numbers are given as:

major.minor.build_number

e.g. 5.3.5928, where the major number is 5, the minor number is 3 and the build number is 5928. Releases with the same major and minor version numbers do not contain any new features; the only difference is that releases with later build numbers contain fixes for bugs discovered since the previous release. New or significantly changed features are released with a higher major or minor version number.
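
The numbering scheme above can be checked mechanically. The following is a minimal sketch in Python, not part of OWLIM; the names `OwlimVersion`, `parse_version` and `same_feature_set` are hypothetical helpers introduced only for illustration. It assumes that version strings always have exactly three numeric components.

```python
from typing import NamedTuple


class OwlimVersion(NamedTuple):
    """Hypothetical container for a 'major.minor.build_number' version."""
    major: int
    minor: int
    build: int


def parse_version(text: str) -> OwlimVersion:
    """Parse a full version string such as '5.3.5928'."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.build_number, got {text!r}")
    major, minor, build = (int(p) for p in parts)
    return OwlimVersion(major, minor, build)


def same_feature_set(a: OwlimVersion, b: OwlimVersion) -> bool:
    """True when two releases differ only by build number (bug fixes only)."""
    return (a.major, a.minor) == (b.major, b.minor)
```

Because `OwlimVersion` is a tuple, two parsed versions can also be ordered with the usual comparison operators, so a later build of the same major.minor release compares as greater.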

Version 5.5 (build 7071)

A new cluster version with improved delete and update speeds; some minor bug fixes are merged from version 5.4. Version 5.5 is based on Sesame 2.7.8, the same version used in 5.4.

  • IMPROVEMENT: Update speed of small transactions increased on large (500m) datasets. The speed gain is 2x-6x. The impact on smaller datasets is between 50% and 100% (LDBC-50m). The indexes use a different format that is not backward compatible with OWLIM-5.4, but automatic conversion takes place when an old image is opened with 5.5. [W-25]
  • IMPROVEMENT: Improved aggregation speed in some specific cases, e.g. SELECT COUNT queries without filters.
  • IMPROVEMENT: Further optimizations in the "optimized rulesets" - rdfs-optimized, owl-horst-optimized, etc. As described in the documentation, the optimized rulesets avoid some features of the RDFS and OWL specifications that result in fairly inefficient inference without adding value for a wide range of applications. The "optimizations" were further developed in 5.5 by removing the rdfs:range axiom for the rdf:type predicate. Tests show ~25% improvement of the update speed on LDBC-50m.
  • IMPROVEMENT: Before 5.5, OWLIM was in many cases very slow when the schema (ontology) was updated. Optimizations were implemented to resolve this problem; they also improved the speed of deletion when many rdf:type statements are deleted. [OWLIM-1435]
  • FIX: "Connection reset" in cluster causes Worker nodes to be OUT-OF-SYNC [B-98]
  • FIX: Surrounding FILTER clause with braces changes the result of query [W-26]
  • FIX: Under heavy R/W load when Lucene plugin is used, addToIndex() could cause synchronization problems (AlreadyClosedException inside Lucene) [OWLIM-1442]

Version 5.4 (build 6863)

This is a cluster maintenance version with improved Master-Worker synchronization protocol and improved logging. Sesame was upgraded to 2.7.8.

  • Upgrade to Sesame 2.7.8 (See its release notes here: https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11300)
  • IMPROVEMENT: Owlim-Ent uses SPARQL over HTTP instead of raw sockets for communication
  • IMPROVEMENT: Logging: Query log, slow query log; monitor & reload logback configuration by default
  • FIX: Drop graph causing statement to be deleted from another graph
  • FIX: Sesame HTTP Client rewritten (part of cluster protocol improvement)
  • FIX: CLEAR GRAPH <graph> not removing all statements
  • FIX: Query optimizer takes ~1s for certain queries
  • FIX: owl2-rl/sameAs problem in LDBC
  • FIX: Query returns non-existent data. In the case of

    we always return bindings related to 'a' (the others are ignored) due to an issue with binding propagation outward from the UNION.

Version 5.4 (build 6590)

This is a patch for a cluster/replication problem in Owlim-Ent. It also contains a new property for the number of incremental updates allowed and a Sesame Workbench "Explore" improvement. Here is the list:

  • FIX: critical bug within the ReplicationCluster that could cause data loss in case of unsuccessful full replications (OWLIM-1294)
  • FIX: a workaround to WILEY-20 "RepositoryException: attempt to unlock read lock, not locked by current thread"
  • IMPROVEMENT: new MBean attribute exposed to the ReplicationCluster named 'IncrementalUpdateLimit' with a default value of 10. It controls the number of updates considered when deciding whether to do Incremental or Full replication (provided the out-of-synch node is in some known state)
  • IMPROVEMENT: The performance of the Sesame Workbench "Explore" feature was improved in the case of missing context indexes (which is the default in Owlim). OWLIM-1318: getContextsIDs may take longer on large datasets if context indices are not enabled
  • FIX: minor bugfixes to the experimental 'lucene2' plugin
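
The 'IncrementalUpdateLimit' rule described in the list above can be pictured with a small sketch. This is an illustration only, not the ReplicationCluster implementation: the function `choose_replication`, the `Replication` enum and the exact boundary condition (whether a node that is exactly `limit` updates behind still qualifies) are assumptions. Only the attribute name and its default value of 10 come from the release note.

```python
from enum import Enum


class Replication(Enum):
    """Hypothetical labels for the two replication modes named in the note."""
    INCREMENTAL = "incremental"
    FULL = "full"


# Default value stated in the release note for 'IncrementalUpdateLimit'.
DEFAULT_INCREMENTAL_UPDATE_LIMIT = 10


def choose_replication(
    missed_updates: int,
    state_known: bool,
    limit: int = DEFAULT_INCREMENTAL_UPDATE_LIMIT,
) -> Replication:
    """Pick a replication mode for an out-of-synch worker node.

    Assumption: incremental replication is chosen only when the node is in
    a known state and has missed at most `limit` updates.
    """
    if state_known and 0 <= missed_updates <= limit:
        return Replication.INCREMENTAL
    return Replication.FULL
```

Under these assumptions, a worker in a known state that is 3 updates behind would be caught up incrementally, while a worker in an unknown state always receives a full replication.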

Version 5.4 (build 6486)

This release provides a number of fixes and improvements over the previous release. Most importantly there are improvements for page allocation and deallocation during transactions, performance improvements for use-cases with huge amounts of DELETE and DROP GRAPH statements, and other improvements on the efficiency of query execution. The OWLIM Enterprise Cluster should be more responsive under huge load, there is a new dedicated HTTPClient connection reserved for system requests (Master Node - Worker Nodes communication) and those system requests do not interfere with the other user query requests. Also write queries are executed across the cluster in two steps only (instead of three steps).

This release is bundled with Sesame 2.7.7 which provides better control over transactions compared to Sesame 2.6. Transactions are now started with a call to begin() rather than implicitly when an update operation is started. If begin() is not used, then the behaviour reverts to what was previously called 'auto-commit', i.e. any update operation is committed immediately. HTTPRepository now supports background parsing/concurrent reading of results. The release notes for the last few versions of Sesame are here:

NOTE: (Sesame 2.7.4 and above) Although this Sesame release is classified as a minor release (indicating compatibility with earlier releases), users of SPARQL are advised of one change that is not backward compatible with earlier releases: support for the '{min, max}' property path length syntax has been removed from the SPARQL parser, in accordance with its earlier removal from the official SPARQL specification. See issue SES-1706 for details. If you are currently using this particular syntax construct in your SPARQL queries, you are advised to modify those queries before upgrading.
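
Before upgrading, it may help to scan existing queries for the removed construct. The sketch below is a rough heuristic in Python, not a SPARQL parser; `find_removed_path_syntax` is a hypothetical helper. It assumes the removed forms look like `{min,max}`, `{min,}` or `{,max}` with numeric bounds, and it can report false positives, for example when such text appears inside a string literal.

```python
import re

# Brace groups that contain only numeric bounds and a comma, e.g. {1,3},
# {2,} or {,4}. Ordinary group graph patterns such as { ?s ?p ?o } do not
# match because they do not start with a digit or a comma.
_PATH_RANGE = re.compile(r"\{\s*(?:\d+\s*,\s*\d*|,\s*\d+)\s*\}")


def find_removed_path_syntax(query: str) -> list[str]:
    """Return every '{min,max}'-style quantifier found in the query text."""
    return [match.group(0) for match in _PATH_RANGE.finditer(query)]
```

Queries for which the function returns a non-empty list are candidates for manual review and rewriting.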

The bundled Apache Lucene library is updated to version 3.6.2 in preparation for a new full-text search plug-in that will be published in the near future. Several extensions have been made to OWLIM's plug-in API to support this, as well as to allow closer integration with some text analytics components licensed separately from Ontotext.

The full set of updates for this release includes:

  • IMPROVEMENT: Delete performance degradation after deleting large amounts of triples.
  • IMPROVEMENT: Inefficiency in query execution plan and optimiser: IN operator.
  • IMPROVEMENT: Page allocation/deallocation optimisation for transaction control.
  • IMPROVEMENT: Cluster writes are now executed in two steps (instead of three steps).
  • IMPROVEMENT: Create a view to manipulate the Sesame namespaces.
  • IMPROVEMENT: Transaction management for SPARQL updates.
  • IMPROVEMENT: Update configuration parameters.
  • IMPROVEMENT: Multi-threaded BSBM test does not use 100% CPU.
  • IMPROVEMENT: Define and implement correct behaviour from OWLIM explicit and implicit graphs.
  • IMPROVEMENT: Add a query string field to the TrackRecord class for the JMX interface.
  • FIX: Out of Sync events but No of triples does not show as -1 - cluster might never sync.
  • FIX: Using luc:addToIndex causes cluster to go out of sync.
  • FIX: Plugin for TopBraid Composer 4.x does not install/initialise properly.
  • FIX: Consistency checks fail.
  • FIX: Problem with "a owl:sameAs b" and "a != b" constraint.
  • FIX: StackOverflow error while evaluating query.
  • FIX: Some strange message is being logged - WARNING: tried to remove unknown connection object from store.
  • FIX: After Sesame 2.7.5 upgrade, DESCRIBE stops working.
  • FIX: Drop GRAPH causing statement to be deleted from another graph.
  • FIX: luc:addToIndex causing StackOverflowError.
  • FIX: Performance of query with statements contained in GRAPH expression.
  • FIX: "Explore" doesn't work for stores with large amounts of data.
  • FIX: Large number of SPARQL filter parameters causes stack trace.
  • FIX: Failing SPARQL query.
  • FIX: XMLSchema#date not valid? Error while handling request (500): Query evaluation error: Malformed query result from server.
  • FIX: Issue with DELETE of a context.
  • FIX: System hangs on clearing the documents.
  • FIX: FTS luc:excludePredicates, includePredicates don't work properly.
  • FIX: Deeper FTS molecule (moleculeSize="3") not collected.
  • FIX: Query fails with "Cannot retrieve literal with ID of #" after previous RTE "Could not flush entity storage".
  • FIX: The built-in FTS crashes with ArrayIndexOutOfBoundsException.
  • FIX: Lucene never ends indexing literals.
  • FIX: There are queries with quad patterns that don't seem to use the context index but would evidently benefit from it.
  • FIX: Loading data with ftsIndexPolicy=onCommit occasionally throws an ArrayIndexOutOfBoundsException.
  • FIX: Feedback UI functionality is missing.
  • FIX: Missing some standard namespaces.
  • FIX: During BSBM 10B, fetching namespace by prefix would cause performance issues, if namespaces count is around a hundred thousand or so.
  • FIX: Remove 'headed' consistency check implementation.
  • FIX: Various combinations of INSERT/DELETE/WHERE fail with OWLIM-Enterprise.
  • FIX: JMX management beans are not being closed on shutdown.
  • FIX: Out of sync state reached after rolling back a transaction with a blank node due to a consistency violation.
  • FIX: Checkbox behavior problems on Repository edit page.
  • FIX: Lucene searches stop working after a while.
  • FIX: OWLIM Update feedback missing.
  • FIX: Backup does not work when node is not on the same machine.
  • FIX: sesame:directType not working correctly.
  • FIX: Incorrect counting of statements.
  • FIX: disable-sameAs returns redundant results for { ?s owl:sameAs ?o }.
  • FIX: Parser properties for getting-started are being ignored.
  • FIX: Error creating repository "Expected literal for property base-Url".
  • FIX: Corrupted predicate statistics.
  • FIX: No info message for server restart when Ruleset setting is changed.
  • FIX: On edit repository page Ruleset setting is not remembered.
  • FIX: When adding namespaces to SPARQL --> Query page existing query is removed.
  • FIX: Default namespace cannot be selected.
  • FIX: OWLIM-Workbench doesn't appear to report an expired license.
  • FIX: Lucene alternative Scorer and ScorerFactory are not excluded from obfuscation and are thus unavailable to users.
  • FIX: SPARQL updates should not be influenced by the query-timeout and query-limit-results parameters.
  • FIX: NullPointerException on a query.
  • FIX: Sample queries are being decoded.
  • FIX: Query timeout is not considered when using SPARQL's COUNT() operator but is considered when using SailConnectionImpl.getStatements().
  • FIX: Dialog when exceeding maximum size for import uninformative.
  • FIX: OWLIM-Workbench returns wrong status code for missing media type on POST.
  • FIX: OWLIM-Workbench does not implement HTTP HEAD method (and wrong status code returned).
  • FIX: When creating a new user, make access to the SYSTEM repository default to read.
  • FIX: "FTS Memory" field parse error in create repository page
  • FIX: "Add Location" text field does not escape spaces
  • FIX: Make rule compiler's errors more informative.
  • FIX: Errors not displayed on data import via .ttl files.
  • FIX: Unify SailIterationMonitor's MBean ObjectName to comply to other MBeans ObjectNames.
  • FIX: Postprocess plug-in: flush() disallows entities.put(..., Scope.REQUEST) in read-only mode.
  • FIX: Unicode string literal "\u0000" is not handled correctly when using remote repository.

Version 5.3 (build 6115)

  • FIX: The value of the PendingWrites management bean is now properly decremented when the cluster is not writable
  • FIX: Storage index no longer grows unexpectedly after executing DROP ALL or RepositoryConnection.clear()
  • FIX: Incremental retraction may cause invalid inferred statements within contexts to remain after a deletion
  • FIX: A memory leak may be triggered if the logger level is DEBUG or finer on some of the internal components

Version 5.3 (build 6011)

  • FIX: Inconsistent data-type indexes may lead to invalid or partial query results
  • FIX: Rebuilding predicate lists can lead to an infinite loop
  • FIX: Timeout related exception during Node status check may falsely flag a node as OFF
  • FIX: Replication may end with corrupted image on some Windows environments

Version 5.3 (build 5928)

  • FIX: Reduced contention for parallel queries on the shared configuration data structures
  • FIX: Numerous changes to reduce memory consumption and memory leaks
  • FIX: For apparent corruption of predicate statistics used during query optimisation
  • FIX: For file handle leak when incrementally updating a lucene index when no changes have occurred
  • FIX: For removing a class membership of an instance of a member of owl:intersectionOf set that does not remove the membership to the intersection itself
  • FIX: Incorrect computation in complexity estimation that can lead to suboptimal query plan.
  • FIX: Prevent out of sync state when aborting a transaction due to a consistency violation
  • FIX: Added more checking for detecting a failed update on the probe worker node

Version 5.3 (build 5849)

This maintenance release addresses a number of critical cluster stability issues:

  • FIX: Rejected update causes worker to be flagged as OFF - this could lead to an unexpected deep-replication event in the presence of high query loads
  • FIX: Deep replication can fail to complete properly. This intermittent problem can leave a worker node corrupt and unable to restart
  • FIX: Worker can enter a hung state when trying to shut down for replication
  • FIX: Cluster master initiates multiple deep-replication operations
  • FIX: Consistency violation could cause calling thread to deadlock
  • FIX: SPARQL updates should not be influenced by the query-timeout and query-limit-results parameters

Version 5.3 (build 5777)

  • IMPROVEMENT: Small transaction logs are processed in memory, so that a series of small updates are processed more quickly.
  • IMPROVEMENT: Better values for statistics are used for query optimisation for certain statement patterns.
  • FIX: Use of sesame:directType predicate returns extra incorrect matches during query answering.
  • FIX: Query-timeout no longer applies to backup operations.
  • FIX: Occasional NullPointerException when query evaluation exceeds time-out setting.
  • FIX: Performance degradation after a large number of inserts and deletes due to incorrect predicate statistics that affect query optimisation.
  • FIX: Spurious exception trace when re-initialising a Lucene FTS index.
  • FIX: Certain configurations of Lucene FTS index can not be serialised.
  • FIX: A bug prevented owl:sameAs statements from being visible during query answering.
  • FIX: Allow for greater tolerance when checking worker status to avoid false "node off" events.
  • FIX: Online backup fails to execute in some environments.
  • FIX: Second phase of a cluster update is done in one wave.

Version 5.3

This is a maintenance release that includes Sesame 2.6.10 and the following significant updates:

  • Backup and restore methods have been added to the JMX interface. These use the replication function to copy a worker node database image to a directory on the cluster master node for making a backup, and also to copy an image from the master node to all attached workers when restoring from a backup.
  • A new 'remote replication' feature allows two clusters to remain in synch. This can be useful when maintaining a disaster recovery cluster instance, because it allows the master in the remote cluster to appear like a normal worker node in order to receive updates from the master in the main cluster.
  • A performance degradation when loading very large datasets has been fixed. After a few billion statements load performance started to drop and by around 6 billion statements performance was four times slower than it should be. After the fix the data loading speed at 20 billion statements is around 50% slower than with an empty database.
  • It is now possible to put a global limit on the number of query results per query. Any queries that generate more results will have the remainder truncated. This feature can be useful for any public-facing SPARQL endpoints.
  • Consistency checks in OWLIM-SE and OWLIM-Enterprise are now strictly enforced. If consistency checking is enabled, data that causes an inconsistency will not be allowed and an update transaction containing an inconsistency will abort and rollback to the previous database state. For example, if using the OWL2-RL ruleset an attempt to declare an individual as being a member of two disjoint classes will trigger a rollback.
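
The global result limit mentioned in the list above amounts to truncating each result stream after a fixed number of items. The sketch below shows that idea in Python under stated assumptions: `limit_results` is a hypothetical helper, not an OWLIM API, and the way OWLIM applies the limit internally is not described in these notes.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def limit_results(results: Iterable[T], max_results: int) -> Iterator[T]:
    """Yield at most `max_results` items; the remainder is never read."""
    if max_results < 0:
        raise ValueError("max_results must be non-negative")
    return islice(results, max_results)
```

Because the truncation is lazy, a query that could produce millions of bindings stops being consumed as soon as the limit is reached, which is the property that makes such a cap useful for public-facing endpoints.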

The full set of updates for this release includes:

  • New Feature
    • OWLIM-527 - Allow for forced termination of an update transaction
    • OWLIM-999 - Remote replication for OWLIM cluster and online backup/restore
  • Improvement
    • OWLIM-720 - Cluster main node needs to check ruleset on nodes
    • OWLIM-887 - MD5 snapshot is too slow for OWLIM-Enterprise
    • OWLIM-888 - Improve logging of query execution plan and JMX query monitoring
    • OWLIM-902 - Globally limit the number of results for queries
    • OWLIM-908 - Improve query logging.
  • Bug
    • OWLIM-386 - Cluster constantly attempts to resynch when a worker is set up incorrectly
    • OWLIM-559 - Builtin ruleset works differently if used as precompiled "owl-max(-optimized)" and through distribution Builtin_Rules.pie
    • OWLIM-820 - LUBM fails with external ruleset
    • OWLIM-822 - DELETE query with a wildcard predicate takes excessive time
    • OWLIM-856 - Sesame server stops responding after period of use
    • OWLIM-857 - OwlimSchemaRepository and SailImpl do not implement NotifyingSail.
    • OWLIM-858 - DESCRIBE query causes SailConnectionImpl.evaluate() to throw RuntimeException
    • OWLIM-860 - Query-timeout causes out of memory error
    • OWLIM-862 - Lucene lock problem when using full-text search incremental update
    • OWLIM-863 - ASK query matches statements in default graph when includeInferred=false
    • OWLIM-864 - Memory leak with simple ASK query
    • OWLIM-865 - Predicate list index causes some query results to be lost
    • OWLIM-880 - Performance degradation after a large number of inserts and deletes
    • OWLIM-883 - Rebuilding context index failed
    • OWLIM-904 - Plug-in 'preprocess()' called twice within single request session
    • OWLIM-911 - Accessing internal identifiers returns same value for all
    • OWLIM-913 - Remove number of pages in POS/PSO from worker signature
    • OWLIM-914 - Differently configured worker is not detected as OUT OF SYNCH
    • OWLIM-917 - Cluster master unstable after removing worker node
    • OWLIM-918 - Cluster does not allow any updates to be processed
  • Task
    • OWLIM-787 - More configuration parameters used to create worker node fingerprints.
    • OWLIM-842 - Stateful replication
    • OWLIM-866 - Add SPARQL update functionality to getting-started
    • OWLIM-876 - Verify optional indices are up-to-date and rebuild if necessary
    • OWLIM-877 - Log full version number at start-up
    • OWLIM-900 - Allow plug-ins to force the rollback of a transaction
    • OWLIM-905 - Force a rollback when a consistency check fails