OWLIM-Enterprise Release Notes

New features and significant bug-fixes/updates for the last few releases are recorded here. Full version numbers are given as:

major.minor.build_number

e.g. 5.3.5928, where the major number is 5, the minor number is 3 and the build number is 5928. Releases with the same major and minor version numbers do not contain any new features; the only difference is that releases with later build numbers contain fixes for bugs discovered since the previous release. New or significantly changed features are released with a higher major or minor version number.
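
As an illustration of the numbering scheme, the following Python sketch parses a full version string and checks whether two releases share the same feature set. The class and function names are hypothetical and are not part of OWLIM.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    build: int


def parse_version(text: str) -> Version:
    """Parse 'major.minor.build_number', e.g. '5.3.5928'."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.build_number, got {text!r}")
    major, minor, build = (int(p) for p in parts)
    return Version(major, minor, build)


def same_feature_set(a: Version, b: Version) -> bool:
    """Releases with equal major and minor numbers differ only in bug fixes."""
    return (a.major, a.minor) == (b.major, b.minor)


older = parse_version("5.3.5928")
newer = parse_version("5.3.6011")
print(same_feature_set(older, newer))  # True: both are 5.3 builds
print(newer > older)                   # True: tuples compare field by field
```

Because the fields are compared in order, sorting a list of such tuples orders releases by major, then minor, then build number.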

Version 5.6 (build 7713)

Improvements:

  • [OWLIM56:LVM-based Backup and Replication] - Backup can optionally be based on the LVM Shadow Volume Copy, which makes it faster; the worker is released a few seconds after the backup is started (ported from 5.4).
  • [OWLIM56:New Cluster Test (cluster deployment and test tool)] - a tool for automated deployment and testing of clusters of various sizes. Can deploy on AWS and local instances. Supports the Docker format. Allows acceptance, stress and load tests to be run on the deployed clusters. Optionally creates a Nagios configuration for the deployed cluster.
  • LoadRDF tool - a tool for faster bulk loading of data, merged from the 5.5 branch
  • Merged EntityPool Reverse Cache from 5.5 - will speed up larger updates (100+ statements)

Fixes:

  • All AcceptanceTests that were previously failing are now fixed
    • Improved communication between master and worker nodes with respect to the above tests
    • Worker thread: fixed out-of-sync handling upon init
  • Improved logging and in particular fixed the skip of some stacktraces by the JVM
  • Initialization of a 5.6 worker from a 5.4 image now skips "entityIdSize" and InferencerCRC from owlim.properties

Version 5.6 beta 3 (build 7659)

Fixes:

  • cluster - empty worker initialization
  • worker - initial update handling
  • log sync: 10s wait between idle rounds (network bandwidth optimization)
  • Tx log: initialization bug fixed
  • update might fail when replication is in progress
  • misc bug fixes in the cluster utils (deployment/status) and proxies restart
  • detailed logging:
    • replication cluster worker events
    • HTTP client stats
    • Tx log initialization

Known issues:

  • AcceptanceTests failing: W4, M4, MW3, MW7, MW8
  • the new/experimental LVM backup/restore feature is not yet ported from 5.4 (and thus MW10 and MW11 Acceptance Tests are not implemented, because they are based on it)

Version 5.6 beta 2 (build 7523)

Fixes:

  • updated AcceptanceTests in the MastersAndWorkers section
  • replication start/wait methods improved
  • several fixes to the TxLog protocol
  • fixed replication logic to delete the Worker repo only when the remote worker confirms the replication
  • additional sanity checks added to the Master-to-Master and Master-to-Worker synchronization
  • improved logging, incl. "SPLITBRAIN" events logged both to logs and to JMX

Known issues:

  • some MW* tests with the forced replication fail randomly but rarely - related to the Proxy tool

Version 5.6 beta 1 (build 7368)

Ontotext redesigned its cluster architecture to support the case of two or more separate data centres (each with its own Master and Worker nodes) and to provide asynchronous transactions and Master failover. OWLIM Enterprise already supported Master-Worker clusters with Automatic Replication, Load Balancing and Transaction Logs, but in this release these components were improved. OWLIM 5.6 is based on 5.5 and inherits its write performance improvements.

  • IMPROVEMENT: [OWLIM56:Client Failover Utility] which can be configured to fall back to the next master if the first master becomes unavailable
  • IMPROVEMENT: Better TransactionLog support (see: [OWLIM56:Transaction Log Improvements]) - the updates are synchronized between all masters in all data centers
  • IMPROVEMENT: All Masters are now Read/Write
  • IMPROVEMENT: [OWLIM56:Smart Replication]
  • IMPROVEMENT: Protocol backward compatibility - the ability to upgrade the OWLIM cluster without downtime, following the OWLIM Upgrade Procedure.
  • IMPROVEMENT: [OWLIM56:External Plug-ins] - the plugins in OWLIM have been moved into a separate plugin directory and can now be upgraded/maintained separately.
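
The client failover behaviour listed above can be pictured with a small Python sketch. This is not the Client Failover Utility itself: the function, the endpoint URLs and the "send" callback are hypothetical, and the sketch only shows the idea of trying the configured masters in order.

```python
from typing import Callable, Sequence


def query_with_failover(
    masters: Sequence[str],
    send: Callable[[str, str], str],
    query: str,
) -> str:
    """Send `query` to the first master that answers, trying them in order."""
    errors = []
    for url in masters:
        try:
            return send(url, query)
        except ConnectionError as exc:
            # This master is unavailable; remember why and try the next one.
            errors.append(f"{url}: {exc}")
    raise ConnectionError("no master available: " + "; ".join(errors))


# Stand-in transport for the example (hypothetical): the first master is down.
def fake_send(url: str, query: str) -> str:
    if url == "http://master1.example/sparql":
        raise ConnectionError("connection refused")
    return f"answered by {url}"


masters = ["http://master1.example/sparql", "http://master2.example/sparql"]
print(query_with_failover(masters, fake_send, "ASK { ?s ?p ?o }"))
```

Passing the transport in as a callback keeps the retry logic independent of any particular HTTP client.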

Known issues:

  • CONCERN: Transaction consistency. In the new cluster, the Master responds to an update from a client as soon as the test node completes it.
    In a single-threaded scenario, the next query could be evaluated on a node that has either not yet received the update or not yet completed it, which could lead to inconsistency from the client's point of view. This deviates from update processing in 5.4, where the response is created after the last of the available nodes completes the update. [OWLIM-1483]
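
The read-after-write effect described above can be reproduced with a toy model. The classes below are hypothetical and simulate only the ordering of events; they are not OWLIM code and do not reflect the real cluster protocol.

```python
from typing import List, Set, Tuple


class Worker:
    def __init__(self, name: str) -> None:
        self.name = name
        self.statements: Set[str] = set()


class Master:
    """Toy master that acknowledges an update once the test node has applied it."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers
        self.pending: List[Tuple[Worker, str]] = []

    def update(self, statement: str) -> str:
        test_node, *others = self.workers
        test_node.statements.add(statement)  # the test node completes the update
        self.pending.extend((w, statement) for w in others)  # the rest apply it later
        return "ok"  # the client receives its response at this point

    def query(self, worker_index: int, statement: str) -> bool:
        return statement in self.workers[worker_index].statements

    def finish_pending(self) -> None:
        for worker, statement in self.pending:
            worker.statements.add(statement)
        self.pending.clear()


master = Master([Worker("w1"), Worker("w2")])
master.update("ex:a ex:p ex:b")
print(master.query(1, "ex:a ex:p ex:b"))  # False: w2 has not applied the update yet
master.finish_pending()
print(master.query(1, "ex:a ex:p ex:b"))  # True once all workers have caught up
```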

Version 5.5 (build 7071)

A new cluster version with improved delete and update speeds; some minor bug fixes are merged from version 5.4. Version 5.5 is based on Sesame 2.7.8, the same version that is used in 5.4.

  • IMPROVEMENT: Update speed of small transactions increased on large (500m) datasets. The speed gain is 2x-6x. The gain on smaller datasets is between 50% and 100% (LDBC-50m). The indexes use a different format, which is not backward compatible with OWLIM-5.4, but automatic conversion takes place when an old image is opened with 5.5. [W-25]
  • IMPROVEMENT: Improved aggregation speed in some specific cases, e.g. SELECT COUNT queries without filters.
  • IMPROVEMENT: Further optimizations in the "optimized rulesets" - rdfs-optimized, owl-horst-optimized, etc. As described [in the documentation], the optimized rule-sets avoid some features of the RDFS and OWL specifications that result in fairly inefficient inference without adding value for a wide range of applications. The "optimizations" were further developed in 5.5 by removing the rdfs:range axiom for the rdf:type predicate. Tests show ~25% improvement of the update speed on LDBC-50m.
  • IMPROVEMENT: Before 5.5, OWLIM performed very slowly in many cases when the schema (ontology) was updated. Optimizations were implemented to resolve this problem; they also improved the speed of deletion when many rdf:type statements are deleted. [OWLIM-1435]
  • FIX: "Connection reset" in cluster causes Worker nodes to be OUT-OF-SYNC [B-98]
  • FIX: Surrounding FILTER clause with braces changes the result of query [W-26]
  • FIX: Under heavy R/W load when Lucene plugin is used, addToIndex() could cause synchronization problems (AlreadyClosedException inside Lucene) [OWLIM-1442]

Version 5.4 (build 6863)

This is a cluster maintenance version with improved Master-Worker synchronization protocol and improved logging. Sesame was upgraded to 2.7.8.

  • Upgrade to Sesame 2.7.8 (See its release notes here: https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11300)
  • IMPROVEMENT: Owlim-Ent uses SPARQL over HTTP instead of raw sockets for communication
  • IMPROVEMENT: Logging: Query log, slow query log; monitor & reload logback configuration by default
  • FIX: Drop graph causing statement to be deleted from another graph
  • FIX: Sesame HTTP Client rewritten (part of cluster protocol improvement)
  • FIX: CLEAR GRAPH <graph> not removing all statements
  • FIX: Query optimizer takes ~1s for certain queries
  • FIX: owl2-rl/sameAs problem in LDBC
  • FIX: query returns non-existent data. In the case of

    we always return bindings related to 'a' (the others are ignored) due to an issue with binding propagation out of the UNION.

Version 5.4 (build 6590)

This is a patch for a cluster/replication problem in Owlim-Ent. It also contains a new property for the number of incremental updates allowed and a Sesame Workbench "Explore" improvement. Here is the list:

  • FIX: critical bug within the ReplicationCluster that could cause data loss in case of unsuccessful full replications (OWLIM-1294)
  • FIX: a workaround to WILEY-20 "RepositoryException: attempt to unlock read lock, not locked by current thread"
  • IMPROVEMENT: new MBean attribute exposed by the ReplicationCluster named 'IncrementalUpdateLimit', with a default value of 10. It controls the number of updates considered when deciding whether to do Incremental or Full replication (provided the out-of-synch node is in some known state)
  • IMPROVEMENT: The performance of the Sesame Workbench "Explore" feature was improved in the case of missing context indexes (which is the default in Owlim). OWLIM-1318: getContextsIDs may take longer on large datasets if context indices are not enabled
  • FIX: minor bugfixes to the experimental 'lucene2' plugin
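
The role of 'IncrementalUpdateLimit' can be sketched as a simple decision rule. The Python function below is a hypothetical illustration based only on the description above (a default limit of 10, and a node in a known state). The actual ReplicationCluster logic is not shown in these notes and may differ, including on whether the comparison is inclusive.

```python
from typing import Optional

DEFAULT_INCREMENTAL_UPDATE_LIMIT = 10  # default value stated in the release note


def choose_replication(
    missed_updates: Optional[int],
    limit: int = DEFAULT_INCREMENTAL_UPDATE_LIMIT,
) -> str:
    """Return 'incremental' or 'full' for an out-of-synch worker.

    `missed_updates` is None when the worker is not in a known state,
    in which case its distance from the master cannot be counted.
    """
    if missed_updates is None:
        return "full"
    # Assumption: a worker exactly at the limit still qualifies.
    return "incremental" if missed_updates <= limit else "full"


print(choose_replication(3))     # incremental
print(choose_replication(25))    # full
print(choose_replication(None))  # full
```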

Version 5.4 (build 6486)

This release provides a number of fixes and improvements over the previous release. Most importantly, there are improvements for page allocation and deallocation during transactions, performance improvements for use-cases with huge amounts of DELETE and DROP GRAPH statements, and other improvements to the efficiency of query execution. The OWLIM Enterprise Cluster should be more responsive under huge load: there is a new dedicated HTTPClient connection reserved for system requests (Master Node - Worker Nodes communication), so those system requests do not interfere with user query requests. Also, write queries are executed across the cluster in two steps only (instead of three steps).

This release is bundled with Sesame 2.7.7 which provides better control over transactions compared to Sesame 2.6. Transactions are now started with a call to begin() rather than implicitly when an update operation is started. If begin() is not used, then the behaviour reverts to what was previously called 'auto-commit', i.e. any update operation is committed immediately. HTTPRepository now supports background parsing/concurrent reading of results. The release notes for the last few versions of Sesame are here:

NOTE: (Sesame 2.7.4 and above) Although this Sesame release is classified as a minor release (indicating compatibility with earlier releases), users of SPARQL are advised of one change that is not backward compatible with earlier releases: support for the '{min, max}' property path length syntax has been removed from the SPARQL parser, in accordance with its earlier removal from the official SPARQL specification. See issue SES-1706 for details. If you are currently using this particular syntax construct in your SPARQL queries, you are advised to modify those queries before upgrading.
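
Before upgrading, existing queries can be screened for the removed syntax. The Python sketch below uses a regular expression as a heuristic: it looks for a brace group containing only digits and a comma, such as {1,3}, {2,} or {,4}. It does not parse SPARQL, so it can report matches inside string literals or comments, and the function name is hypothetical.

```python
import re

# Heuristic: a brace group holding only "min,max", "min," or ",max".
PATH_LENGTH_RANGE = re.compile(r"\{\s*(?:\d+\s*,\s*\d*|,\s*\d+)\s*\}")


def find_path_length_ranges(query: str) -> list:
    """Return every '{min,max}'-style fragment found in a SPARQL query string."""
    return PATH_LENGTH_RANGE.findall(query)


old_style = "SELECT ?x WHERE { ?s foaf:knows{1,3} ?x }"
new_style = "SELECT ?x WHERE { ?s foaf:knows+ ?x }"

print(find_path_length_ranges(old_style))  # ['{1,3}']
print(find_path_length_ranges(new_style))  # []
```

Queries flagged this way still need to be rewritten by hand, since the replacement depends on the intended path lengths.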

The bundled Apache Lucene library is updated to version 3.6.2 in preparation for a new full-text search plug-in that will be published in the near future. Several extensions have been made to OWLIM's plug-in API to support this, as well as to allow closer integration with some text analytics components licensed separately from Ontotext.

The full set of updates for this release include:

  • IMPROVEMENT: Delete performance degradation after deleting large amounts of triples.
  • IMPROVEMENT: Inefficiency in query execution plan and optimiser: IN operator.
  • IMPROVEMENT: Page allocation/deallocation optimisation for transaction control.
  • IMPROVEMENT: Cluster writes are now executed in two steps (instead of three steps).
  • IMPROVEMENT: Create a view to manipulate the Sesame namespaces.
  • IMPROVEMENT: Transaction management for SPARQL updates.
  • IMPROVEMENT: Update configuration parameters.
  • IMPROVEMENT: Multi-threaded BSBM test does not use 100% CPU.
  • IMPROVEMENT: Define and implement correct behaviour from OWLIM explicit and implicit graphs.
  • IMPROVEMENT: Add a query string field to the TrackRecord class for the JMX interface.
  • FIX: Out of Sync events but No of triples does not show as -1 - cluster might never sync.
  • FIX: Using luc:addToIndex causes cluster to go out of sync.
  • FIX: Plugin for TopBraid Composer 4.x does not install/initialise properly.
  • FIX: Consistency checks fail.
  • FIX: Problem with "a owl:sameAs b" and "a != b" constraint.
  • FIX: StackOverflow error while evaluating query.
  • FIX: Some strange message is being logged - WARNING: tried to remove unknown connection object from store.
  • FIX: After Sesame 2.7.5 upgrade, DESCRIBE stops working.
  • FIX: Drop GRAPH causing statement to be deleted from another graph.
  • FIX: luc:addToIndex causing StackOverflowError.
  • FIX: Performance of query with statements contained in GRAPH expression.
  • FIX: "Explore" doesn't work for stores with large amounts of data.
  • FIX: Large number of SPARQL filter parameters causes stack trace.
  • FIX: Failing SPARQL query.
  • FIX: XMLSchema#date not valid? Error while handling request (500): Query evaluation error: Malformed query result from server.
  • FIX: Issue with DELETE of a context.
  • FIX: System hangs on clearing the documents.
  • FIX: FTS luc:excludePredicates, includePredicates don't work properly.
  • FIX: Deeper FTS molecule (moleculeSize="3") not collected.
  • FIX: Query fails with "Cannot retrieve literal with ID of #" after previous RTE "Could not flush entity storage".
  • FIX: The built-in FTS crashes with ArrayIndexOutOfBoundsException.
  • FIX: Lucene never ends indexing literals.
  • FIX: There are queries with quad patterns that don't seem to use the context index but would evidently benefit from it.
  • FIX: Loading data with ftsIndexPolicy=onCommit occasionally throws an ArrayIndexOutOfBoundsException.
  • FIX: Feedback UI functionality is missing.
  • FIX: Missing some standard namespaces.
  • FIX: During BSBM 10B, fetching namespace by prefix would cause performance issues, if namespaces count is around a hundred thousand or so.
  • FIX: Remove 'headed' consistency check implementation.
  • FIX: Various combinations of INSERT/DELETE/WHERE fail with OWLIM-Enterprise.
  • FIX: JMX management beans are not being closed on shutdown.
  • FIX: Out of sync state reached after rolling back a transaction with a blank node due to a consistency violation.
  • FIX: Checkbox behavior problems on Repository edit page.
  • FIX: Lucene searches stop working after a while.
  • FIX: OWLIM Update feedback missing.
  • FIX: Backup does not work when node is not on the same machine.
  • FIX: sesame:directType not working correctly.
  • FIX: Incorrect counting of statements.
  • FIX: disable-sameAs returns redundant results for { ?s owl:sameAs ?o }.
  • FIX: Parser properties for getting-started are being ignored.
  • FIX: Error creating repository "Expected literal for property base-Url".
  • FIX: Corrupted predicate statistics.
  • FIX: No info message for server restart when Ruleset setting is changed.
  • FIX: On edit repository page Ruleset setting is not remembered.
  • FIX: When adding namespaces to SPARQL --> Query page existing query is removed.
  • FIX: Default namespace cannot be selected.
  • FIX: OWLIM-Workbench doesn't appear to report an expired license.
  • FIX: Lucene alternative Scorer and ScorerFactory are not excluded from obfuscation and are thus unavailable for the users.
  • FIX: SPARQL updates should not be influenced by the query-timeout and query-limit-results parameters.
  • FIX: NullPointerException on a query.
  • FIX: Sample queries are being decoded.
  • FIX: Query timeout is not considered when using SPARQL's COUNT() operator but is considered when using SailConnectionImpl.getStatements().
  • FIX: Dialog when exceeding maximum size for import uninformative.
  • FIX: OWLIM-Workbench returns wrong status code for missing media type on POST.
  • FIX: OWLIM-Workbench does not implement HTTP HEAD method (and wrong status code returned).
  • FIX: When creating a new user, make access to the SYSTEM repository default to read.
  • FIX: "FTS Memory" field parse error in create repository page
  • FIX: "Add Location" text field does not escape spaces
  • FIX: Make rule compiler's errors more informative.
  • FIX: Errors not displayed on data import via .ttl files.
  • FIX: Unify SailIterationMonitor's MBean ObjectName to comply to other MBeans ObjectNames.
  • FIX: Postprocess plug-in: flush() disallows entities.put(..., Scope.REQUEST) in read-only mode.
  • FIX: Unicode string literal "\u0000" is not handled correctly when using remote repository.

Version 5.3 (build 6115)

  • FIX: The value of the PendingWrites management bean is now properly decremented when the cluster is not writable
  • FIX: Storage index no longer grows unexpectedly after executing DROP ALL or RepositoryConnection.clear()
  • FIX: Incremental retraction may cause invalid inferred statements within contexts to remain after a deletion
  • FIX: A memory leak may be triggered if the logger level is DEBUG or finer on some of the internal components

Version 5.3 (build 6011)

  • FIX: Inconsistent data-type indexes may lead to invalid or partial query results
  • FIX: Rebuilding predicate lists can lead to an infinite loop
  • FIX: Timeout related exception during Node status check may falsely flag a node as OFF
  • FIX: Replication may end with corrupted image on some Windows environments

Version 5.3 (build 5928)

  • FIX: Reduced contention for parallel queries on the shared configuration data structures
  • FIX: Numerous changes to reduce memory consumption and memory leaks
  • FIX: For apparent corruption of predicate statistics used during query optimisation
  • FIX: For file handle leak when incrementally updating a lucene index when no changes have occurred
  • FIX: For removing a class membership of an instance of a member of owl:intersectionOf set that does not remove the membership to the intersection itself
  • FIX: Incorrect computation in complexity estimation that can lead to suboptimal query plan.
  • FIX: Prevent out of sync state when aborting a transaction due to a consistency violation
  • FIX: Added more checking for detecting a failed update on the probe worker node

Version 5.3 (build 5849)

This maintenance release addresses a number of critical cluster stability issues:

  • FIX: Rejected update causes worker to be flagged as OFF - this could lead to an unexpected deep-replication event in the presence of high query loads
  • FIX: Deep replication can fail to complete properly. This intermittent problem can leave a worker node corrupt and unable to restart
  • FIX: Worker can enter a hung state when trying to shut down for replication
  • FIX: Cluster master initiates multiple deep-replication operations
  • FIX: Consistency violation could cause calling thread to deadlock
  • FIX: SPARQL updates should not be influenced by the query-timeout and query-limit-results parameters

Version 5.3 (build 5777)

  • Improvement: Small transaction logs are processed in memory, so that a series of small updates are processed more quickly.
  • Improvement: Better values for statistics are used for query optimisation for certain statement patterns.
  • FIX: Use of sesame:directType predicate returns extra incorrect matches during query answering.
  • FIX: Query-timeout no longer applies to backup operations.
  • FIX: Occasional NullPointerException when query evaluation exceeds time-out setting.
  • FIX: Performance degradation after a large number of inserts and deletes due to incorrect predicate statistics that affect query optimisation.
  • FIX: Spurious exception trace when re-initialising a Lucene FTS index.
  • FIX: Certain configurations of Lucene FTS index can not be serialised.
  • FIX: A bug prevented owl:sameAs statements from being visible during query answering.
  • FIX: Allow for greater tolerance when checking worker status to avoid false "node off" events.
  • FIX: Online backup fails to execute in some environments.
  • FIX: Second phase of a cluster update is done in one wave.

Version 5.3

This is a maintenance release that includes Sesame 2.6.10 and the following significant updates:

  • Backup and restore methods have been added to the JMX interface. These use the replication function to copy a worker node database image to a directory on the cluster master node for making a backup, and also to copy an image from the master node to all attached workers when restoring from a backup.
  • A new 'remote replication' feature allows two clusters to remain in synch. This can be useful when maintaining a disaster recovery cluster instance, because it allows the master in the remote cluster to appear like a normal worker node in order to receive updates from the master in the main cluster.
  • A performance degradation when loading very large datasets has been fixed. After a few billion statements load performance started to drop and by around 6 billion statements performance was four times slower than it should be. After the fix the data loading speed at 20 billion statements is around 50% slower than with an empty database.
  • It is now possible to put a global limit on the number of query results per query. Any queries that generate more results will have the remainder truncated. This feature can be useful for any public-facing SPARQL endpoints.
  • Consistency checks in OWLIM-SE and OWLIM-Enterprise are now strictly enforced. If consistency checking is enabled, data that causes an inconsistency will not be allowed and an update transaction containing an inconsistency will abort and rollback to the previous database state. For example, if using the OWL2-RL ruleset an attempt to declare an individual as being a member of two disjoint classes will trigger a rollback.
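
The strict enforcement described above amounts to an all-or-nothing commit. The following Python sketch models it with a toy store and a single disjointness check. The class and its methods are hypothetical; in OWLIM, consistency checking is driven by the ruleset rather than by hand-written code.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


class InconsistencyError(Exception):
    pass


class ToyStore:
    def __init__(self, disjoint: Iterable[Tuple[str, str]]) -> None:
        self.types: Dict[str, Set[str]] = {}
        self.disjoint: Set[FrozenSet[str]] = {frozenset(pair) for pair in disjoint}

    def _is_consistent(self, types: Dict[str, Set[str]]) -> bool:
        # Inconsistent if any individual belongs to both classes of a disjoint pair.
        return not any(
            pair <= classes for classes in types.values() for pair in self.disjoint
        )

    def commit(self, additions: Iterable[Tuple[str, str]]) -> None:
        """Apply (individual, class) pairs, or reject the whole transaction."""
        candidate = {ind: set(classes) for ind, classes in self.types.items()}
        for individual, cls in additions:
            candidate.setdefault(individual, set()).add(cls)
        if not self._is_consistent(candidate):
            # Rollback: self.types was never touched, so the old state remains.
            raise InconsistencyError("individual belongs to two disjoint classes")
        self.types = candidate


store = ToyStore(disjoint=[("ex:Cat", "ex:Dog")])
store.commit([("ex:tom", "ex:Cat")])
try:
    store.commit([("ex:rex", "ex:Dog"), ("ex:tom", "ex:Dog")])
except InconsistencyError:
    print(sorted(store.types))  # ['ex:tom']: ex:rex was rolled back as well
```

Note that the valid statement about ex:rex is discarded together with the offending one, because the transaction is rejected as a whole.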

The full set of updates for this release include:

  • New Feature
    • OWLIM-527 - Allow for forced termination of an update transaction
    • OWLIM-999 - Remote replication for OWLIM cluster and online backup/restore
  • Improvement
    • OWLIM-720 - Cluster main node needs to check ruleset on nodes
    • OWLIM-887 - MD5 snapshot is too slow for OWLIM-Enterprise
    • OWLIM-888 - Improve logging of query execution plan and JMX query monitoring
    • OWLIM-902 - Globally limit the number of results for queries
    • OWLIM-908 - Improve query logging.
  • Bug
    • OWLIM-386 - Cluster constantly attempts to resynch when a worker is set up incorrectly
    • OWLIM-559 - Builtin ruleset works differently if used as precompiled "owl-max(-optimized)" and through distribution Builtin_Rules.pie
    • OWLIM-820 - LUBM fails with external ruleset
    • OWLIM-822 - DELETE query with a wildcard predicate takes excessive time
    • OWLIM-856 - Sesame server stops responding after period of use
    • OWLIM-857 - OwlimSchemaRepository and SailImpl do not implement NotifyingSail.
    • OWLIM-858 - DESCRIBE query causes SailConnectionImpl.evaluate() to throw RuntimeException
    • OWLIM-860 - Query-timeout causes out of memory error
    • OWLIM-862 - Lucene lock problem when using full-text search incremental update
    • OWLIM-863 - ASK query matches statements in default graph when includeInferred=false
    • OWLIM-864 - Memory leak with simple ASK query
    • OWLIM-865 - Predicate list index causes some query results to be lost
    • OWLIM-880 - Performance degradation after a large number of inserts and deletes
    • OWLIM-883 - Rebuilding context index failed
    • OWLIM-904 - Plug-in 'preprocess()' called twice within single request session
    • OWLIM-911 - Accessing internal identifiers returns same value for all
    • OWLIM-913 - Remove number of pages in POS/PSO from worker signature
    • OWLIM-914 - Differently configured worker is not detected as OUT OF SYNCH
    • OWLIM-917 - Cluster master unstable after removing worker node
    • OWLIM-918 - Cluster does not allow any updates to be processed
  • Task
    • OWLIM-787 - More configuration parameters used to create worker node fingerprints.
    • OWLIM-842 - Stateful replication
    • OWLIM-866 - Add SPARQL update functionality to getting-started
    • OWLIM-876 - Verify optional indices are up-to-date and rebuild if necessary
    • OWLIM-877 - Log full version number at start-up
    • OWLIM-900 - Allow plug-ins to force the rollback of a transaction
    • OWLIM-905 - Force a rollback when a consistency check fails

Version 5.2 (build 5563)

  • Fix to prevent the query optimiser choosing a sub-optimal query plan after a long sequence of insert and delete modifications. Fragmentation of storage pages was causing errors in the complexity computations.
  • Fix to prevent concurrent modification exceptions when namespaces are being updated.

Version 5.2 (build 5512)

  • Fix to prevent a memory leak due to connection references kept by the PluginManager. This also can cause a performance degradation over time.
  • Fix to dataset management that was causing explicit triples from the default (nameless) graph to be included as input to query execution when the query uses FROM or FROM NAMED and the includeInferred parameter is set to false.

Version 5.2 (build 5497)

  • Fix to prevent org.apache.lucene.store.LockObtainFailedException when incrementally updating a Lucene index. The index's configuration was being incorrectly serialised causing unpredictable behaviour.
  • Fix for missing query results when the optional predicate-lists index is switched on. With this index enabled, statements with certain predicates were being ignored.

Version 5.2 (build 5479)

  • Fix for an out of memory error that can be caused when using the query-timeout parameter.

Version 5.2 (build 5421)

  • Update to fire a JMX notification when a worker node is low on disk space

Version 5.2 (build 5331)

  • Fix the known problem that prevents custom rule files being compiled when using Java 1.7
  • Fix to avoid the stack overflow problem when optimising certain SPARQL queries that use the MINUS operator.

Version 5.2 (build 5316)

This is a maintenance release that includes Sesame 2.6.8 (see the change logs for 2.6.7 and 2.6.8). Note that 2.6.7 is NOT backward compatible with 2.6.6 due to a couple of minor changes to interfaces. The following significant updates have been made:

  • A number of resilience improvements to cluster management
    • Better handling of out of disk space problems
    • Better communication between workers in all modes of operation
    • New JMX operation to cancel replication
  • Support for the N-Quads RDF format
  • Changes to the Plug-in SDK
    • Add transaction begin/end information to Statements.Listener interface
    • Allow for pre-processor plug-ins to modify the query inside their request
    • StatementIterator has new methods for testing read-only, explicit and implicit status
  • Improvements to the getting-started application to allow it to load very large RDF files without the need to break them in to smaller pieces
  • Improved locking (less contention) in the entity pool and when using RDF Rank
  • Cache/index statistic now always collected over JMX (attribute to switch this on/off has been removed)

Known problems:

  • Custom rule-sets will not compile when using Java 1.7 - this will be fixed in the near future with an interim update

The full set of updates for this release includes:

  • Bug
    • OWLIM-359 - Support different file formats in "imports" parameter
    • OWLIM-495 - Blank node contexts ignored by getStatements()
    • OWLIM-696 - Context parameter ignored when reading statements using HTTP protocol
    • OWLIM-767 - Improve thread synchronisation for RDF rank plug-in
    • OWLIM-776 - Cluster failure due to replicating to a worker that is out of disk space
    • OWLIM-782 - INSERT update hangs and consumes huge amount of memory
    • OWLIM-784 - Poor query performance due to query optimisation problems when using the Sesame's QueryJoinOptimizer.
    • OWLIM-804 - OWLIM-SE runs a query with an optional and a property path 10 times slower than OWLIM-Lite.
    • OWLIM-807 - Using Predicate Lists has an adverse effect on query performance - incorrect complexity estimate
    • OWLIM-813 - Slow deletion of statements using DELETE WHERE
    • OWLIM-815 - Lucene FTS functional tests failing when executed against remote repository
    • OWLIM-825 - Query timeout does not apply on certain queries
  • Improvement
    • OWLIM-697 - Add support for the NQuads RDF format
    • OWLIM-590 - Improve efficiency of RDFRank recomputation
    • OWLIM-706 - Collect and export statistics over JMX unconditionally
    • OWLIM-777 - Worker nodes should respond appropriately on system status check in any state
  • New Feature
    • OWLIM-582 - Allow for Preprocessor plug-ins to modify the query inside their request
    • OWLIM-774 - Create a read-only system named graph that will be used in the UniversalConverter to separate the schema statements from the other explicit statements.
  • Task
    • OWLIM-496 - Axiomatic statements should behave as inferred statements during query answering
    • OWLIM-788 - Add transaction information to plug-in SDK callback interface
    • OWLIM-796 - Add more methods to plug-in SDK StatementIterator to test for explicit and implicit attributes
    • OWLIM-816 - Share BufferPool instances that manage ByteBuffers of the same size
    • OWLIM-830 - Improve loading behaviour of getting started to handle huge RDF files

Version 5.1 (build 5208)

  • Fix the known problem in build 5183. An incompatibility between OWLIM and Sesame query optimisation (QueryJoinOptimizer) causes poor performance in certain circumstances. The use of QueryJoinOptimizer has been restricted to sub-select optimisation only.

Version 5.1 (build 5183)

This is a maintenance release that includes Sesame 2.6.6 and many fixes from interim releases made since 5.0. Repositories created with version 5.0 are binary compatible with 5.1, i.e. the OWLIM software can be updated and used with existing storage files created with version 5.0.

The following improvements have been made:

  • Axiomatic statements now treated as inferred statements during query answering
  • License file can now be set using an environment variable: OWLIM_LICENSE_FILE
  • Username and password parameters added to GettingStarted when using remote repositories with HTTP authentication
  • This release includes Sesame 2.6.6 - please see the change log for this release.

The following bugs have been addressed:

  • Full replication fails when using different platform specific local pathnames.
  • Read-only (imported) statements lose their read-only status when migrating from previous versions
  • Increased memory demand in OWLIM due to delayed finalization
  • Plugin API's statement modification methods (Statements.put/delete) don't allow modifications
  • Fixed an integer overflow bug in the compression module that results in exception whenever the overlay file grows beyond 2G.
  • Fixed a bug that can cause big query time increase when using plugins. It occurs when using the plugin triple patterns in combination with an ordinary one which has a very large collection size and is placed before the plugin triple pattern.
  • Fixed a bug that can cause incorrect query optimisation and/or a NullPointerException at query time. This problem can occur when estimating the number of matching triples for patterns containing a predicate for which there are no asserted statements in the repository.
  • Workaround to avoid unnecessary BottomUpJoinIteration(sub-selects intersection) when a sub-select is joined with an ordinary statement pattern or a join of such.
  • System statements filtered out from getContextIDs()
  • Resolved memory leaks when updates are mixed with queries involving unbound predicate variables. These caused all unused indexes to be kept locked.
  • Apply external bindings prior to query optimization. Speeds up such queries by avoiding having filters in 'AfterOptionals'
  • Collected Namespaces were not properly persisted on Windows
  • Added transactional handling of changes in properties file - fingerprints, namespaces, geometry, etc.
  • Rolled back transactions do not close transaction log files in a timely manner, leading to "too many open files" error when many rollbacks occur in sequence. Clean-up code has been relocated to ensure that it is called immediately.
  • Equivalence class updates do not close temporary files in a timely manner, leading to "too many open files" error when many transactions containing owl:sameAs statements are committed in sequence. Temporary files are now removed immediately after the transaction completes.
  • Rebuild of predicate lists fails on Windows - the old files were locked and not deleted.
  • Repository lockfile is not released after failed initialisation. The new behaviour is to remove the lockfile if initialisation has failed and the lock file did not exist at the start of initialisation.
  • Schema updates should not allow removal of inferred statements.
  • Suboptimal query plan when using geospatial index
  • System contexts (from ternary relations in rule files) are visible to the Sesame workbench when browsing contexts
  • Namespaces lost when instance terminated

Known problems

The improvements in both Sesame and OWLIM for better optimisation of sub-queries have unfortunately caused a regression in query performance when using property paths with optional elements. This issue is being urgently addressed and a fix will be available very soon via an interim 5.1 release (later build number).

The problem affects queries of this form, e.g.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT * WHERE {
  ex:PersonA foaf:knows/foaf:knows?/foaf:name ?name .
}

If you experience this problem, it might be possible to re-write such queries using a UNION, e.g.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT * WHERE {
  { ex:PersonA foaf:knows/foaf:name ?name }
  UNION
  { ex:PersonA foaf:knows/foaf:knows/foaf:name ?name }
  # with more UNIONs for longer paths as necessary
}

Version 5.0

  • This version of OWLIM-Enterprise is not backward compatible with any previous version. This means that images created with OWLIM 4.3 and before will not work correctly with OWLIM 5.0 and must be re-created. There have been a great many modifications to the storage files, indexing structures, etc., and upgrade mechanisms have proven too complex and probably slower than re-loading the database anyway. Please do not attempt to upgrade to OWLIM 5.0 unless you drop and recreate all databases.
  • Transaction management and isolation mechanisms have been completely refactored. The previous strategy used very lazy writing of modified database pages, such that dirty pages were only flushed to disk when further updates occur and no more memory is available. While extremely fast, the problem with this approach is that there is a considerable recovery time associated with replaying the transaction log after an abnormal termination. The new mechanism uses two modes: 'bulk-loading' (fast) with similar behaviour to previous versions and 'normal' (safe) where database modifications are flushed to disk as part of the commit operation. When running in safe mode, database recovery is instant and there is a significant improvement in concurrency between updates and queries. Some related changes are:
    • There is a new parameter to control the transaction 'mode' called transaction-mode - see the configuration section
    • The database-recovery-policy configuration parameter is no longer required and has been removed
    • The special flush predicate http://www.ontotext.com/flush used to force pages to be written to disk is no longer required and has been removed. Statements using this predicate will be treated like any other statement.
    • The special reinfer predicate http://www.ontotext.com/owlim/system#reinfer used to force a re-computation of all inferences has been removed. Statements using this predicate will be treated like any other statement.
    • In fast transaction mode, the isolation constraint can be relaxed in order to improve concurrency behaviour when strict read isolation is not a requirement - this is controlled by a new parameter transaction-isolation that only has an effect in fast mode, see the configuration section
    • No recovery mechanisms are in place when running in fast mode - therefore administrators must treat an abnormal termination during bulk-loading as a fatal event and must restart the loading procedure
  • New context indices can be used to improve query performance when data is modeled using many named graphs. These are switched on and off using a single configuration parameter enable-context-index - see the configuration section
  • The SPARQL 1.1 Graph Store HTTP Protocol is now supported according to the W3C Working Draft from the 12th May 2011. This provides a REST interface for managing collections of graphs, using either directly or indirectly named graphs.
  • Sesame 2.6.5 with many bug-fixes and updates to bring SPARQL 1.1 Query support up to the latest W3C Working Draft from the 5th January 2012.
  • Significant reduction in disk-space requirements is achieved with the following modifications:
    • Index compression can now be used to reduce disk storage requirements by using zip compression on database pages. This feature is off by default, but can be switched on when creating a new repository. The configuration parameter index-compression-ratio can be set to -1 (the default value indicating no compression) or a value in the range [10-50] indicating the desired percentage reduction in page sizes. Any pages that cannot be compressed by the specified amount are stored uncompressed. Therefore a compression ratio that is too aggressive will not bring many benefits. Experiments have shown that for large datasets a value of about 30% is close to optimal.
    • Restructuring of the triple indices has also led to a reduction in disk-space requirements of around 18% independent of the compression functionality
    • Entity compression is a modification that reduces the storage requirements for the lookup table that maps between internal identifiers and resources. This is transparent to the user and happens automatically. More disk space reductions are apparent using this version.
  • A new literal index is created automatically for numeric and date/time data-types. The index is used during query evaluation only if a query or a subquery (e.g. union) has a filter that is comprised of a conjunction of literal constraints, e.g. FILTER(?x >= 3 && ?y <= 5 && ?start > "2001-01-01"^^xsd:date). Other patterns, including those that use negation, will not use the index for this version of OWLIM.
  • All control queries now use SPARQL Update syntax (used mostly to control the Lucene-based full-text search, RDF Rank and geo-spatial plug-ins). This has a number of advantages, namely:
    • No special control query pseudo-graph is required by the Replication Cluster master in order to identify control queries that must be pushed to all worker nodes
    • SPARQL Updates use the corresponding SPARQL update protocol, so they can be automatically processed by load-balancers that examine URL patterns
    • It is more consistent with the SPARQL language, since these 'control queries' cause a change of state in OWLIM
  • Incremental updates to the Lucene-based full-text search index allow the index to be updated for specific resources or for all un-indexed resources. Using this technique can avoid the more expensive approach of rebuilding the whole index frequently.
  • Incremental RDF Rank allows the RDF rank for specific resources to be (re-)computed as directed by the user. This technique can avoid the more expensive approach of rebuilding all RDF Rank values frequently.
  • The Geo-spatial index has been updated to support 40-bit resource identifiers.
  • The getting started application has been restructured so that it now works with remote repositories.
  • OWLIM-Enterprise also includes the following maintenance updates and fixes:
    • Bugs
      • OWLIM-703 Getting started in OWLIM-Enterprise distribution zip has incorrect sail type in owlim.ttl
      • OWLIM-669 Sesame workbench no longer provides OWLIM options when creating a new repository
      • OWLIM-624 Entering the license file in Sesame Workbench requires backslashes to be escaped (twice)
    • New features
      • OWLIM-671 Port the custom "Update Timeout" functionality into 4.3 and 5.0 branches
      • OWLIM-636 Improved software license validation
    • Other
      • OWLIM-629 Make Getting Started more robust
      • OWLIM-522 Refactor LUBM test drivers
      • OWLIM-519 Drop Java 1.5 compatibility (change to Java 1.6)
      • OWLIM-498 Format and clean-up entire OWLIM code-base using eclipse formatter
      • OWLIM-422 Reformulate control queries to use SPARQL 1.1 Update syntax
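
As an illustration of the literal index condition described above, the following sketch shows a query whose filter is a pure conjunction of literal comparisons, which is the form these notes say can use the index. The ex: namespace and the property names (ex:quantity, ex:weight, ex:startDate) are hypothetical and chosen only for the example.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://example.com#>
SELECT ?item WHERE {
  ?item ex:quantity ?x ;
        ex:weight ?y ;
        ex:startDate ?start .
  # Conjunction of numeric and date constraints only
  FILTER(?x >= 3 && ?y <= 5 && ?start > "2001-01-01"^^xsd:date)
}

By contrast, a filter that uses negation, such as FILTER(!(?x < 3)), falls under the "other patterns" that do not use the index in this version.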

Known problems with OWLIM 5.0 BETA 3

  • The behaviour of the 'include inferred' checkbox in the Sesame Workbench is unpredictable when using OWLIM repositories.

Version 4.3

Further contributions to the Sesame framework from Ontotext and Fluid Operations mean that Sesame version 2.6 is included with this version of OWLIM. The following new features are available:

  • SPARQL 1.1 Federation support that allows queries to pull together data from any number of distributed SPARQL endpoints
  • A new SPARQL repository type to wrap SPARQL endpoints
  • Improvements to the parser for controlling the level of literal/data-type validation and the handling of errors
  • Many other fixes for compliance with the latest revised SPARQL 1.1 working drafts
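
A federated query of the kind mentioned above might look like the following sketch. It uses the SERVICE keyword as defined in the final SPARQL 1.1 Federated Query specification; the working draft supported by this release may differ in detail. The endpoint URL and the ex: resource are hypothetical.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT ?friend ?name ?nick WHERE {
  # Evaluated against the local repository
  ex:PersonA foaf:knows ?friend .
  ?friend foaf:name ?name .
  # Evaluated against a remote SPARQL endpoint (hypothetical URL)
  SERVICE <http://remote.example.org/sparql> {
    ?friend foaf:nick ?nick .
  }
}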

OWLIM now has a plug-in API that allows users to build software components that alter the behaviour of OWLIM. This mechanism can be used to add new features or to improve performance in certain scenarios.

OWLIM also includes the following maintenance updates and fixes:

  • OWLIM-205 - Validate literal languages and do not allow invalid language tags to enter the repository
  • OWLIM-273 - Potential thread leak in QueryModelConverter
  • OWLIM-390 - Counting statements using Sesame API gives strange results.
  • OWLIM-419 - Make RepositoryConnection.exportStatements obey the time limit
  • OWLIM-426 - Unable to permanently remove predefined namespace definitions
  • OWLIM-428 - Explicit axioms don't show up as explicit if they have been inferred before by other axioms
  • OWLIM-463 - Clear transaction log in replication cluster if it cannot be initialized
  • OWLIM-466 - SesameConnectionImpl.getStatements must return quads, not trips (breaks workbench explore)
  • OWLIM-470 - Query with Union and optional returns wrong results
  • OWLIM-471 - Can not access new repository when FTS switched on (divide by zero or lockfile locked)
  • OWLIM-473 - onto:explicit pseudo-graph does not prevent implicit statements as input for query answering
  • OWLIM-475 - Repackaged console.sh in openrdf-console.zip has lost its execute attribute
  • OWLIM-476 - Neither of the slf4j jars (api or jdk14) are needed in the war files
  • OWLIM-483 - Lost solutions to queries with FROM <...> clause
  • OWLIM-485 - Repository with many transactions fails to get restored
  • OWLIM-488 - Incorrect behaviour of FROM and FROM NAMED in SPARQL queries
  • OWLIM-489 - Predicate list indices do not log statistics
  • OWLIM-490 - User-supplied Dataset object on query not properly handled
  • OWLIM-491 - Query rewriting in MainQuery.convertToOptimizedForm() converts OR to AND in filters when converting the condition to disjunctive normal form
  • OWLIM-495 - Blank node contexts ignored by getStatements()
  • OWLIM-501 - Lucene and OPTIONAL query bug
  • OWLIM-502 - The database restorer deletes the pso and pos files after second unsuccessful restore
  • OWLIM-457 - Validate data-type values at load time
  • OWLIM-497 - Update getting-started and add timestamps
  • OWLIM-356 - Optimized rule set is not compatible with the rule compiler.
  • OWLIM-480 - Make use of the com.ontotext.trree.collections for the predicate map in order to reuse the file header and the common interface

Version 4.2

Ontotext have continued to invest in the Sesame project and are pleased to announce the inclusion of Sesame version 2.5 with this version of OWLIM. The benefits include:

  • SPARQL 1.1 Update - this extension of SPARQL provides a much more powerful method to modify RDF databases without the requirement for developers to use frameworks and APIs.
  • SPARQL 1.1 Query conformance has been updated to the May 2011 working draft, i.e. all the remaining behaviour has been implemented along with all the new SPARQL filter functions.
  • The SPARQL protocol has also been updated to January 2010 working draft.
  • A new binary RDF serialization format. This format has been derived from the existing binary tuple results format. Its main features are reduced parsing overhead and minimal memory requirements.
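
To give an idea of what SPARQL 1.1 Update makes possible without any framework or API code, here is a minimal sketch of an update request containing two operations. The syntax follows the final SPARQL 1.1 Update specification, so details may differ from the working draft implemented in this release; the ex: resources and the name values are hypothetical.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>

# Operation 1: add two statements
INSERT DATA {
  ex:PersonA foaf:name "Alice" ;
             foaf:knows ex:PersonB .
} ;

# Operation 2: replace the existing name in a single step
DELETE { ex:PersonA foaf:name ?old }
INSERT { ex:PersonA foaf:name "Alice Smith" }
WHERE  { ex:PersonA foaf:name ?old }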

As well as integration with the new Sesame APIs and modifications for optimising SPARQL Update, there have also been a number of bug fixes in this version of OWLIM-Enterprise:

  • OWLIM-396 - A RuntimeException is thrown in clearNamespaces() in SailConnection
  • OWLIM-404 - HashEntityPool fails to store/read its entity index table if its size is more than ~500M
  • OWLIM-408 - Getting of default namespace doesn't work
  • OWLIM-440 - Can not create geo-spatial index when using OWLIM-SE with Tomcat
  • OWLIM-443 - Repository fails to start - entity pool error
  • OWLIM-445 - disable-sameAs causing query evaluation to lose bindings
  • OWLIM-446 - Query.setIncludeInferred() is ignored
  • OWLIM-447 - License file can not be specified - default evaluation license is always used.
  • OWLIM-449 - Wrong conversion from int to long in com.ontotext.trree.plugin.lucene.LuceneIterator
  • OWLIM-452 - Multiple wrong results are returned for a CONSTRUCT query
  • OWLIM-454 - EntityStorageVersion3 fails to restore if a long entity has negative size.
  • OWLIM-455 - Cannot put any more statements in AVL tree after ~3.1B statements added during 3.5-to-4.0 conversion
  • OWLIM-305 - Rationalise OWLIM vocabulary

Version 4.1

This maintenance release includes Sesame 2.4.2, which fixes several important bugs in SPARQL 1.1 Query support.

Also included are some updates to OWLIM-SE:

  • Unexpected binding returned in a SPARQL query with union within an optional expression
  • FILTER in OPTIONAL patterns returns incorrect results
  • Aggregate SPARQL query fails with IndexOutOfBoundsException
  • Default and named graphs set in a SPARQL query are ignored by the Jena connector

Version 4.0

  • OWLIM Replication Cluster has been renamed to OWLIM-Enterprise and is distributed separately from OWLIM-SE. This new name better identifies this software component as the flagship product of the OWLIM family suitable for mission critical applications.
  • Easy to deploy WAR files: The distribution now includes openrdf-sesame and openrdf-workbench Web applications pre-configured with OWLIM and ready to deploy. This makes installing OWLIM as a server and creating/administrating OWLIM repositories trivially simple. The WAR files can be found in the sesame_owlim directory of the distribution ZIP file. See 'easy install' in the [installation section].
  • SPARQL 1.1 Query: Ontotext has invested significant development resources in the Sesame project in order to bring SPARQL 1.1 support to all editions of OWLIM. Since OWLIM-Enterprise is a distributed architecture based on OWLIM-SE, OWLIM-Enterprise also includes SPARQL 1.1 Query, but without federation support for the moment. SPARQL 1.1 Update support will be included in the next release. The new features include:
    • Aggregates
    • Subqueries
    • Negation
    • Expressions in the SELECT clause
    • Property Paths
    • Assignment
    • A short form for CONSTRUCT
    • An expanded set of functions and operators
  • The SPARQL 1.1 specification has not yet become a W3C recommendation and continues to evolve. The following known issues apply to this release of OWLIM and Sesame:
    • fn:concat is not supported. This was added to the working draft in May, just after the Sesame 2.4.0 release was finalised. It will likely be included in the next Sesame/OWLIM release.
    • Empty IN() and NOT IN() clauses will cause an exception - will be fixed in the next release.
    • Using the aggregate function SUM() will cause an exception if there are no bindings over which to do the summation - will be fixed in the next release.
    • Federation is not yet supported. This will be implemented in a later version of Sesame and OWLIM later this year.
    • There are some problems with complex expressions in the SELECT clause. This should be fixed in the next release of Sesame/OWLIM.
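
The following sketch combines several of the SPARQL 1.1 Query features listed above (an aggregate, an expression in the SELECT clause, a property path and negation) in one query. It is written against the final SPARQL 1.1 syntax, which may differ slightly from the working draft supported here, and it deliberately uses COUNT rather than SUM and a simple SELECT expression in view of the known issues. The data it assumes (foaf:knows and foaf:nick statements) is hypothetical.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person (COUNT(?fof) AS ?reach) WHERE {
  # Property path: friends of friends
  ?person foaf:knows/foaf:knows ?fof .
  # Negation: skip people who have a nickname recorded
  FILTER NOT EXISTS { ?person foaf:nick ?nick }
}
GROUP BY ?person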

Version 3.5

This release includes many bug fixes, several new features and updates:

  • Write-only worker node: When worker nodes are added to the cluster via the JMX interface, they can be specified as being 'write-only'. These nodes will be kept in synch with the rest of the cluster, but will not take part in answering cluster queries. The motivation for this feature is to have one or more worker nodes available for batch processing of queries that do not affect the overall query performance of the cluster.
  • Remote notifications: A new mechanism to complement the existing high-performance 'in-process' notification mechanism. This new mechanism allows clients to subscribe to OWLIM Replication Cluster master nodes for notifications about given statement patterns.
  • Online documentation: As well as the PDF format user guides included in the OWLIM distribution zip files, the latest documentation for all editions of OWLIM is now available online.

Version 3.4

  • Replication cluster introduced in this version of OWLIM: it brings resilience, failover and horizontally scalable parallel query processing. A master node component is included that can manage a cluster of worker nodes (standard BigOWLIM instances) to synchronise updates, cater for node failure, dynamically add/remove worker nodes and distribute query requests. Such a setup allows for massive concurrent query performance where the number of queries processed per second scales almost linearly with the number of worker nodes.