GraphDB-Enterprise Release Notes

compared with
Current by Nikola Petrov
on Apr 09, 2015 17:47.

Changes (164)

Starting from version 6, GraphDB includes three separate products with their own version numbers: GraphDB Engine, GraphDB Workbench and GraphDB Connectors (experimental). New features and significant bug-fixes/updates for the last few releases are recorded here. Each product's full version number is given as:

{panel}major.minor.build_number{panel}

e.g. 5.3.5928 where the major number is 5, the minor number is 3 and the build number is 5928.
The integrated releases have their own version, e.g. 6.0-RC1.

h1. Version 5.6(build 7713)
Releases with the same major and minor version numbers do not contain any new features. The only difference is that releases with later build numbers contain fixes for bugs discovered since the previous release. New or significantly changed features are released with a higher major or minor version number.
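
The numbering scheme above can be checked mechanically. The following is a minimal sketch, assuming plain three-part versions such as 5.3.5928; the names Version, parse_version and is_bugfix_only are hypothetical and not part of any GraphDB tool, and integrated release labels such as 6.0-RC1 are deliberately not handled.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    build: int


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.build_number' string such as '5.3.5928'."""
    major, minor, build = (int(part) for part in text.split("."))
    return Version(major, minor, build)


def is_bugfix_only(old: str, new: str) -> bool:
    """True when both versions share major.minor and `new` has a later build."""
    a, b = parse_version(old), parse_version(new)
    return (a.major, a.minor) == (b.major, b.minor) and b.build > a.build


print(parse_version("5.3.5928"))
print(is_bugfix_only("5.3.5928", "5.3.6011"))  # same major.minor, later build
print(is_bugfix_only("5.3.5928", "5.4.6486"))  # minor version changed
```

Under this reading, a release that only changes the build number is expected to carry fixes but no new features.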

h2. Improvements:
* [OWLIM56:LVM-based Backup and Replication] - Backup can optionally be based on the LVM Shadow Volume Copy, which makes it faster; the worker is released a few seconds after the backup is started (ported from 5.4).
* [OWLIM56:New Cluster Test (cluster deployment and test tool)] - a tool for automated deployment and testing of clusters of various sizes. Can deploy on AWS and local instances. Supports docker format. Allows for the running of acceptance, stress and load tests on the deployed clusters. Optionally creates Nagios configuration for the deployed cluster.
* LoadRDF tool - a tool for faster bulk loading of data has been merged from the 5.5 branch
* Merged EntityPool Reverse Cache from 5.5 - will speed up larger updates (100+ statements)
{toc}

h2. Fixes:
* All AcceptanceTests that were previously failing are now fixed
** Improved communication between master and worker nodes with respect to the above tests
** Worker thread: fixed out-of-sync handling upon init
* Improved logging and in particular fixed the skip of some stacktraces by the JVM
* Initialization of a 5.6 worker from a 5.4 image now skips "entityIdSize" and InferencerCRC from owlim.properties
h1. GraphDB 6.1-SP3
This Service Pack 3 build addresses some reported problems with 6.1

h1. Version 5.6 beta 3 (build 7659)
Fixes:
* cluster - empty worker initialization
* worker - initial update handling
* log sync: 10s wait between idle rounds (network bandwidth optimization)
* Tx log: initialization bug fixed
* update might fail when replication is in progress
* misc bug fixes in the cluster utils (deployment/status) and proxies restart
* detailed logging:
** replication cluster worker events
** HTTP client stats
** Tx log initialization
h2. GraphDB Engine 6.1-b4371143

Known issues:
* AcceptanceTests failing: W4, M4, MW3, MW7, MW8
* the new/experimental LVM backup/restore feature is not yet ported from 5.4 (and thus MW10 and MW11 Acceptance Tests are not implemented, because they are based on it)
* OWLIM-1888 Better inference performance for datasets with big number of classes and onProperty restrictions
* Fixed a problem where queries containing UNION and BIND were under certain circumstances returning incorrect results
* New merge command for the storage tool
* OWLIM-1932 Count from graph http://www.ontotext.com/count does not work for describe queries
* OWLIM-1934 Removed some old and unwanted namespaces that were included by default
* OWLIM-1928 Added reversed numeric and date literal indices for better performance
* OWLIM-1990 Worker data is deleted when a plugin fails to initialize
* Using store-backed queue to keep transaction operations on the master - this should prevent out of memory errors when importing huge files

h1. Version 5.6 beta 2 (build 7523)
Fixes:
- updated AcceptanceTests in the MastersAndWorkers section
- replication start/wait methods improved
- several fixes to the TxLog protocol
- fixed replication logic to delete the Worker repo, only when the remote worker confirms the replication
- additional sanity checks added to the Master-to-Master and Master-to-Worker synchronization
- improved logging, incl. "SPLITBRAIN" events logged both to logs and to JMX

Known issues:
- some MW* tests with the forced replication fail randomly but rarely - related to the Proxy tool
h2. GraphDB Workbench 6.4.1
- See all the release notes here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB-Workbench+Release+Notes]

h1. Version 5.6 beta 1 (build 7368)
Ontotext redesigned its cluster architecture to support the case of two or more separate data centres (each with its own Master and Worker nodes) and to provide asynchronous transactions and Master failover. OWLIM Enterprise already supported Master-Worker clusters with Automatic Replication, Load Balancing and Transaction Logs, but in this release these components were improved. Owlim 5.6 is based on 5.5 and inherits its write performance improvements.
- IMPROVEMENT: [OWLIM56:Client Failover Utility] which can be configured to fallback to the next master if the first master becomes unavailable
- IMPROVEMENT: Better TransactionLog support (see: [OWLIM56:Transaction Log Improvements]) - the updates are synchronized between all masters in all data centers
- IMPROVEMENT: All Masters are now Read/Write
- IMPROVEMENT: [OWLIM56:Smart Replication]
- IMPROVEMENT: Protocol backward compatibility - the ability to upgrade the OWLIM cluster without downtime, following the OWLIM Upgrade Procedure.
- IMPROVEMENT: [OWLIM56:External Plug-ins] - the plugins in OWLIM are moved into a separate plugin directory, and now could be upgraded/maintained separately.

h2. Known issues:
- CONCERN: Transaction consistency concern. In the new cluster, the Master responds to an update from a client as soon as the test node completes it.
In a single-threaded scenario, the next query could be evaluated on a node that has either not yet received the update or not yet completed it, which could lead to inconsistency from the client's point of view. This deviates from update processing in 5.4, where the response is created after the last of the available nodes completes it \[OWLIM-1483\]
h2. GraphDB Connectors 3.1.4
- See all the release notes here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB+Connectors+Release+notes]

h1. Version 5.5 (build 7071)
A new cluster version with improved delete and update speeds; some minor bug fixes are merged from 5.4 version. This 5.5 is based on Sesame-2.7.8 - the same version that is in 5.4.
- IMPROVEMENT: Update speed of small transactions increased on large (500m) datasets. The speed gain is 2x-6x. The impact on smaller datasets is between 50-100% (LDBC-50m). The indexes use a different format, which is not backward compatible with OWLIM-5.4, but automatic conversion takes place when an old image is opened with 5.5. \[W-25\]
- IMPROVEMENT: Improved aggregation speed in some specific cases, e.g. SELECT COUNT queries without filters.
- IMPROVEMENT: Further optimizations in the "optimized rulesets" - rdfs-optimized, owl-horst-optimized, etc. As described [in the documentation|OWLIM55:OWLIM-SE Reasoner#Performance Optimizations in RDFS and OWL Support], the optimized rule-sets avoid some features in RDFS and OWL specification that result in fairly inefficient inference, without adding value for a wide range of applications. The "optimizations" were further developed in 5.5, by removing the rdfs:range axiom for rdf:type predicate. Tests show \~25% improvement of the update speed on LDBC-50m.
- IMPROVEMENT: Before 5.5, OWLIM in many cases performed very slowly when the schema (ontology) was updated. Optimizations were implemented to resolve this problem; these also improved the speed of deletion when many rdf:type statements are deleted. \[OWLIM-1435\]
- FIX: "Connection reset" in cluster causes Worker nodes to be OUT-OF-SYNC \[B-98\]
- FIX: Surrounding FILTER clause with braces changes the result of query \[W-26\]
- FIX: Under heavy R/W load when Lucene plugin is used, addToIndex() could cause synchronization problems (AlreadyClosedException inside Lucene) \[OWLIM-1442\]

h1. Version 5.4 (build 6863)
This is a cluster maintenance version with improved Master-Worker synchronization protocol and improved logging. Sesame was upgraded to 2.7.8.
h1. GraphDB 6.1-SP1
This Service Pack 1 build addresses consistency issue with the GraphDB Storage. The cluster's backup/restore functionality is also fixed.

- Upgrade to Sesame 2.7.8 (See its release notes here: https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11300)
- IMPROVEMENT: Owlim-Ent uses SPARQL over HTTP instead of raw sockets for communication
- IMPROVEMENT: Logging: Query log, slow query log; monitor & reload logback configuration by default
- FIX: Drop graph causing statement to be deleted from another graph
- FIX: Sesame HTTP Client rewritten (part of cluster protocol improvement)
- FIX: CLEAR GRAPH <graph> not removing all statements
- FIX: Query optimizer takes ~1s for certain queries
- FIX: owl2-rl/sameAs problem in LDBC
- FIX: query returns non-existent data. In the case of
{code}
OPTIONAL { { a } UNION { b } UNION { c } } or OPTIONAL { FILTER(a || b || c) }
{code}
we always returned bindings related to 'a' (the others were ignored) due to an issue with binding propagation out of the UNION.
h2. GraphDB 6.1.8410
- extended new cluster backup/restore to support multiple named backups. Important note: the backup now uses a name instead of an absolute folder
- fixes for the master URL parameter (autodetected in most cases, thus no need to specify manually)
- minor cluster improvements (shutdown on OOM during sync; worker: handling multiple initializations and no reporting of missing fingerprints on init; fixed race condition in replication; new internal URIs (not URLs) for replication)
- fixed: [owlim-1853] LoadRDF under Windows sometimes does not include statements in the PSO index and the indexes are inconsistent
- Added a check/warning to see if there is enough memory for the entity pool when it gets restored from persistence (-Xmx - cache-memory should be >= entity pool size * 1.25, i.e. 25% overhead is left for the datatype index and other in-memory structures included in the entity pool and for other purposes, e.g. for running queries)
- Constraint validation and support for multiple rulesets is out of its experimental stage and now in production. See the docs here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB+Constraint+Validation]
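
The entity pool memory rule quoted above (-Xmx minus cache-memory should be at least 1.25 times the entity pool size) can be expressed as a small check. This is only an illustrative sketch of the arithmetic; the function name entity_pool_fits and the byte-based parameters are assumptions, not the actual check performed by the engine.

```python
GIB = 1024 ** 3


def entity_pool_fits(xmx_bytes: int, cache_memory_bytes: int, entity_pool_bytes: int) -> bool:
    """Apply the rule: (-Xmx - cache-memory) >= entity pool size * 1.25."""
    available = xmx_bytes - cache_memory_bytes
    return available >= entity_pool_bytes * 1.25


# 8 GiB heap, 4 GiB cache-memory, 3 GiB entity pool: 4 GiB >= 3.75 GiB
print(entity_pool_fits(8 * GIB, 4 * GIB, 3 * GIB))
# Same heap and cache, 3.5 GiB entity pool: 4 GiB < 4.375 GiB
print(entity_pool_fits(8 * GIB, 4 * GIB, 3 * GIB + GIB // 2))
```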

h2. GraphDB Workbench 6.3.2
- See all the release notes here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB-Workbench+Release+Notes]

h1. Version 5.4 (build 6590)
This is a patch for cluster/replication problem for Owlim-Ent. It also contains a new property for the number of incremental updates allowed and a Sesame Workbench "Explore" improvement. Here is the list:

- FIX: critical bug within the ReplicationCluster that could cause data loss in case of unsuccessful full replications (OWLIM-1294)
- FIX: a workaround to WILEY-20 "RepositoryException: attempt to unlock read lock, not locked by current thread"
- IMPROVEMENT: new MBean attribute exposed to the ReplicationCluster named 'IncrementalUpdateLimit' with default value of 10. It controls the number of updates considered when deciding whether to perform Incremental or Full replication (provided the out-of-sync node is in some known state)
- IMPROVEMENT: The performance of the Sesame Workbench "Explore" feature was improved in the case of missing context indexes (which is the default in Owlim). OWLIM-1318: getContextsIDs may take longer on large datasets if context indices are not enabled
- FIX: minor bugfixes to the experimental 'lucene2' plugin
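
One possible reading of the 'IncrementalUpdateLimit' attribute described above is sketched here. The function name, its parameters and the exact comparison (less than or equal to the limit) are assumptions made for illustration; the notes only state that the limit influences the choice between incremental and full replication when the out-of-sync node is in a known state.

```python
def choose_replication(missed_updates: int,
                       node_state_known: bool,
                       incremental_update_limit: int = 10) -> str:
    """Pick a replication mode for an out-of-sync node (illustrative only)."""
    # Assumption: incremental replication is only possible from a known state
    # and only while the backlog of updates stays within the limit.
    if node_state_known and missed_updates <= incremental_update_limit:
        return "incremental"
    return "full"


print(choose_replication(3, node_state_known=True))
print(choose_replication(25, node_state_known=True))
print(choose_replication(3, node_state_known=False))
```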
h1. GraphDB - 6.1.8316
This is an Enterprise release only.

h1. Version 5.4 (build 6486)
h2. Improvements (replication cluster):
- Tx Log stability improvements & fixes - scrapped optimistic join procedure; scrapped shared horizon & completed queue; fixed status patching base
- Improved logging detail
- Updates are dispatched to the other peers immediately once accepted and enqueued
- dedicated peer dispatch threads
- fixed split-brain recovery
- Client API: Changed the master retry policy (do not check every master too often; but still make sure that when previously unavailable master becomes available we detect that)

This release provides a number of fixes and improvements over the previous release. Most importantly there are improvements for page allocation and deallocation during transactions, performance improvements for use-cases with huge amounts of DELETE and DROP GRAPH statements, and other improvements on the efficiency of query execution. The OWLIM Enterprise Cluster should be more responsive under huge load, there is a new dedicated HTTPClient connection reserved for system requests (Master Node - Worker Nodes communication) and those system requests do not interfere with the other user query requests. Also write queries are executed across the cluster in two steps only (instead of three steps).
h2. Replication improvements:
- cleanup target directory on accepting replication data
- 60s timeout in replication client
- limited retries in the replication server
- replication client startup delay patch
- retry serving replication data on failure
- exposed exceptions thrown from the replication server

This release is bundled with [Sesame|http://www.openrdf.org] 2.7.7 which provides better control over transactions compared to Sesame 2.6. Transactions are now started with a call to begin() rather than implicitly when an update operation is started. If begin() is not used, then the behaviour reverts to what was previously called 'auto-commit', i.e. any update operation is committed immediately. HTTPRepository now supports background parsing/concurrent reading of results. The release notes for the last few versions of Sesame are here:
* [Sesame 2.7.7|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11200]
* [Sesame 2.7.6|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11100]
* [Sesame 2.7.5|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11002]
* [Sesame 2.7.4|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=11000]
* [Sesame 2.7.3|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=10900]
* [Sesame 2.7.2|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=10800]
* [Sesame 2.7.1|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=10500]
* [Sesame 2.7.0|https://openrdf.atlassian.net/secure/ReleaseNote.jspa?projectId=10000&version=10060]
h2. Other improvements:
- The inferencer debug statistics were not displayed at shutdown because we used the SwitchableInferencer's ones instead of those in currentInferencer that did the job
- Created StorageTool app, used to Scan/Rebuild indexes in case of missing statements

*NOTE:* (Sesame 2.7.4 and above) Although this Sesame release is classified as a minor release (indicating compatibility with earlier releases), users of SPARQL are advised of one change that is not backward compatible with earlier releases: support for the '\{min, max\}' property path length syntax has been removed from the SPARQL parser, in accordance with its earlier removal from the official SPARQL specification. See issue [SES-1706|https://openrdf.atlassian.net/browse/SES-1706] for details. If you are currently using this particular syntax construct in your SPARQL queries, you are advised to modify those queries before upgrading.
h3. Fixes:
- [OWLIM-1838] Custom NTriples/NQuads parser: ArrayOutOfBoundsException can be thrown on empty lines while parsing and then NPE while trying to process t.getMessage() which may be null
- [OWLIM-1834] Fixed: LoadRDF Tool doesn't work with GraphDB Enterprise
- [W-44] Introduced a parameter 'in-clause-max-members' (defaults to 16) which limits the number of entities specified in the IN clause. If the IN clause contains more elements then it won't be optimized to a UNION query
- [W-55] Fixed the ASC/DESC issue in ORDER BY clause when the variable in the ORDER BY is in an OPTIONAL clause (the issue was different number of results returned in ASC/DESC cases).
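
The effect of 'in-clause-max-members' from \[W-44\] can be pictured with the sketch below. It only illustrates the threshold decision (rewrite to UNION branches up to the limit, keep the plain FILTER otherwise); the function plan_in_clause and the BIND-based UNION form are hypothetical and do not reproduce the query optimizer's real rewriting.

```python
from typing import Sequence


def plan_in_clause(variable: str, members: Sequence[str], max_members: int = 16) -> str:
    """Return an illustrative SPARQL fragment for `variable IN (members)`."""
    if len(members) > max_members:
        # Too many members: the IN clause is left as a plain FILTER.
        return f"FILTER({variable} IN ({', '.join(members)}))"
    # Small enough: express every member as its own UNION branch.
    branches = [f"{{ BIND({m} AS {variable}) }}" for m in members]
    return " UNION ".join(branches)


print(plan_in_clause("?x", ["<urn:a>", "<urn:b>"]))
print(plan_in_clause("?x", [f"<urn:{i}>" for i in range(17)]))
```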

The bundled Apache Lucene library is updated to version 3.6.2 in preparation for a new full-text search plug-in that will be published in the near future. Several extensions have been made to OWLIM's plug-in API to support this, as well as to allow closer integration with some [text analytics components|http://www.ontotext.com/kim/text-analysis] licensed separately from Ontotext.

The full set of updates for this release include:
h1. GraphDB version 6.1-RC1

* IMPROVEMENT: Delete performance degradation after deleting large amounts of triples.
* IMPROVEMENT: Inefficiency in query execution plan and optimiser: IN operator.
* IMPROVEMENT: Page allocation/deallocation optimisation for transaction control.
* IMPROVEMENT: Cluster writes are now executed in two steps (instead of three steps).
* IMPROVEMENT: Create a view to manipulate the Sesame namespaces.
* IMPROVEMENT: Transaction management for SPARQL updates.
* IMPROVEMENT: Update configuration parameters.
* IMPROVEMENT: Multi-threaded BSBM test does not use 100% CPU.
* IMPROVEMENT: Define and implement correct behaviour from OWLIM explicit and implicit graphs.
* IMPROVEMENT: Add a query string field to the TrackRecord class for the JMX interface.
* FIX: Out of Sync events but No of triples does not show as -1 - cluster might never sync.
* FIX: Using luc:addToIndex causes cluster to go out of sync.
* FIX: Plugin for TopBraid Composer 4.x does not install/initialise properly.
* FIX: Consistency checks fail.
* FIX: Problem with "a owl:sameAs b" and "a != b" constraint.
* FIX: StackOverflow error while evaluating query.
* FIX: Some strange message is being logged - WARNING: tried to remove unknown connection object from store.
* FIX: After Sesame 2.7.5 upgrade, DESCRIBE stops working.
* FIX: Drop GRAPH causing statement to be deleted from another graph.
* FIX: luc:addToIndex causing StackOverflowError.
* FIX: Performance of query with statements contained in GRAPH expression.
* FIX: "Explore" doesn't work for stores with large amounts of data.
* FIX: Large number of SPARQL filter parameters causes stack trace.
* FIX: Failing SPARQL query.
* FIX: XMLSchema#date not valid? Error while handling request (500): Query evaluation error: Malformed query result from server.
* FIX: Issue with DELETE of a context.
* FIX: System hangs on clearing the documents.
* FIX: FTS luc:excludePredicates, includePredicates don't work properly.
* FIX: Deeper FTS molecule (moleculeSize="3") not collected.
* FIX: Query fails with "Cannot retrieve literal with ID of #" after previous RTE "Could not flush entity storage".
* FIX: The built-in FTS crashes with ArrayIndexOutOfBoundsException.
* FIX: Lucene never ends indexing literals.
* FIX: There are queries with quad patterns that don't seem to use the context index but would evidently benefit from it.
* FIX: Loading data with ftsIndexPolicy=onCommit occasionally throws an ArrayIndexOutOfBoundsException.
* FIX: Feedback UI functionality is missing.
* FIX: Missing some standard namespaces.
* FIX: During BSBM 10B, fetching namespace by prefix would cause performance issues, if namespaces count is around a hundred thousand or so.
* FIX: Remove 'headed' consistency check implementation.
* FIX: Various combinations of INSERT/DELETE/WHERE fail with OWLIM-Enterprise.
* FIX: JMX management beans are not being closed on shutdown.
* FIX: Out of sync state reached after rolling back a transaction with a blank node due to a consistency violation.
* FIX: Checkbox behavior problems on Repository edit page.
* FIX: Lucene searches stop working after a while.
* FIX: OWLIM Update feedback missing.
* FIX: Backup does not work when node is not on the same machine.
* FIX: sesame:directType not working correctly.
* FIX: Incorrect counting of statements.
* FIX: disable-sameAs returns redundant results for \{ ?s owl:sameAs ?o \}.
* FIX: Parser properties for getting-started are being ignored.
* FIX: Error creating repository "Expected literal for property base-Url".
* FIX: Corrupted predicate statistics.
* FIX: No info message for server restart when Ruleset setting is changed.
* FIX: On edit repository page Ruleset setting is not remembered.
* FIX: When adding namespaces to SPARQL --> Query page existing query is removed.
* FIX: Default namespace cannot be selected.
* FIX: OWLIM-Workbench doesn't appear to report an expired license.
* FIX: Lucene alternative Scorer and ScorerFactory are not excluded from obfuscation and are thus unavailable to users.
* FIX: SPARQL updates should not be influenced by the query-timeout and query-limit-results parameters.
* FIX: NullPointerException on a query.
* FIX: Sample queries are being decoded.
* FIX: Query timeout is not considered when using SPARQL's COUNT() operator but is considered when using SailConnectionImpl.getStatements().
* FIX: Dialog when exceeding maximum size for import uninformative.
* FIX: OWLIM-Workbench returns wrong status code for missing media type on POST.
* FIX: OWLIM-Workbench does not implement HTTP HEAD method (and wrong status code returned).
* FIX: When creating a new user, make access to the SYSTEM repository default to read.
* FIX: "FTS Memory" field parse error in create repository page
* FIX: "Add Location" text field does not escape spaces
* FIX: Make rule compiler's errors more informative.
* FIX: Not displayed errors on data import via .ttl files.
* FIX: Unify SailIterationMonitor's MBean ObjectName to comply to other MBeans ObjectNames.
* FIX: Postprocess plug-in: flush() disallows entities.put(..., Scope.REQUEST) in read-only mode.
* FIX: Unicode string literal "\u0000" is not handled correctly when using remote repository.
Highlights:

h1. Version 5.3 (build 6115)
- *Various stability fixes to the cluster*, including proper master shutdown sequence error handling, error-resilient synchronization threads, safe saving of configuration properties of the cluster config. The repository fingerprint now also reflects the number of statements and there is better handling of stress events which happen during transactions & dirty shutdowns (e.g. out of disk space)

* FIX: The value of the PendingWrites management bean is now properly decremented when the cluster is not writable
* FIX: Storage index no longer grows unexpectedly after executing DROP ALL or RepositoryConnection.clear()
* FIX: Incremental retraction may cause invalid inferred statements within contexts to remain after a deletion
* FIX: A memory leak may be triggered if the logger level is DEBUG or finer on some of the internal components
- *Much faster write transactions* for small insert/update/delete operations on large repositories. Results on the LDBC [Semantic Publishing Benchmark|http://ldbcouncil.org/developer/spb] (SPB) at 50M went up from 32 read and 12 write queries per second in ver. 6.0 to 40 reads/s and 31 writes/s in ver. 6.1. The improvement is even more visible with SPB at 1B scale: from 10 reads/s and 2 writes/s in ver. 6.0 to 11 reads/s and 10 writes/s in ver. 6.1. In summary, GraphDB 6.1 handles more than twice as many updates at 50M scale and 5 times as many updates at a scale of 1 billion statements. GraphDB 6.1 is thus capable of handling a true Dynamic Semantic Publishing scenario, like that of the [BBC|http://www.bbc.co.uk/blogs/legacy/bbcinternet/2012/04/sports_dynamic_semantic.html], at a scale of 1 billion statements and higher.
See more: http://www.ontotext.com/graphdb-benchmark-results/

h1. Version 5.3 (build 6011)
- *Improved load capabilities for large new datasets in live database instances*. Scenario description: there is a production cluster with average load - running LDBC-50m (SPB) with 4 reading threads (doing select queries) and 1 writing thread (doing update queries). We need to add a large dataset (e.g. DBPedia) with hundreds of millions of statements -- as fast as possible, but without disrupting the overall cluster speed too much (not introducing write latency of more than 1-2s). The data set doesn't need inference, so it is loaded with the empty rule set.
Our implementation introduces a new "magic" statement (u, u, u), where u=<[http://www.ontotext.com/useParallelInsertion]>. If this statement is inserted at the beginning of the transaction, the data will be loaded almost twice as fast (it reuses parts of the load-chain from the LoadRDF tool) and the engine will also temporarily switch the ruleset to 'empty'. We found that splitting the data set into 50k chunks is a good compromise between high loading speed and a lower latency increase for parallel updates.
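
The chunked loading approach described above could be prepared on the client side roughly as follows. This is a sketch under stated assumptions: that "50k chunks" means 50,000 statements per transaction, that the input is N-Triples, and that each transaction is sent as a SPARQL INSERT DATA update with the magic statement first. The helper names parallel_insert_batches and _to_update are hypothetical.

```python
from typing import Iterable, Iterator, List

MAGIC = "<http://www.ontotext.com/useParallelInsertion>"
# The "magic" statement (u, u, u) from the release note.
MAGIC_STATEMENT = f"{MAGIC} {MAGIC} {MAGIC} ."


def _to_update(statements: List[str]) -> str:
    """Wrap statements in INSERT DATA, with the magic statement placed first."""
    body = "\n".join([MAGIC_STATEMENT, *statements])
    return "INSERT DATA {\n" + body + "\n}"


def parallel_insert_batches(ntriples_lines: Iterable[str],
                            chunk_size: int = 50_000) -> Iterator[str]:
    """Yield one update string per chunk of at most `chunk_size` statements."""
    chunk: List[str] = []
    for line in ntriples_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and N-Triples comments
        chunk.append(line)
        if len(chunk) == chunk_size:
            yield _to_update(chunk)
            chunk = []
    if chunk:
        yield _to_update(chunk)


sample = ["<urn:s> <urn:p> <urn:o> ."] * 5
for update in parallel_insert_batches(sample, chunk_size=2):
    print(update.splitlines()[1])  # the magic statement leads every batch
```

Each yielded string is meant to be executed as its own transaction, so that parallel updates from other clients are only delayed by one chunk at a time.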

* FIX: Inconsistent data-type indexes may lead to invalid or partial query results
* FIX: Rebuilding predicate lists can lead to an infinite loop
* FIX: Timeout related exception during Node status check may falsely flag a node as OFF
* FIX: Replication may end with corrupted image on some Windows environments
- Small *improvements in the bulk loading tool (LoadRDF)*. It is now possible to load different files into different contexts, as well as to provide Statements to it programmatically. See the [GraphDB-SE LoadRDF tool|GraphDB6:GraphDB-SE LoadRDF tool] page for details. These improvements, combined with the increased update speed, allow us to load the English part of DBPedia 2014 (566M statements) in less than an hour, at a speed of 179 905 statements/sec.
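
As a quick consistency check of the DBPedia figures above, the quoted statement count and load speed can be combined. The count is taken as exactly 566,000,000 here, which is an approximation of the rounded "566M" in the note.

```python
statements = 566_000_000      # "566M statements" (rounded figure from the note)
rate_per_second = 179_905     # quoted load speed in statements/sec

seconds = statements / rate_per_second
minutes = seconds / 60
print(f"{minutes:.1f} minutes")
```

The result is roughly 52 minutes, which agrees with the "less than an hour" claim.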

h1. Version 5.3 (build 5928)
- Improvements in *GraphDB Workbench*: The focus with this release was on *security* (users, roles) as well as small stability and usability improvements.

* FIX: Reduced contention for parallel queries on the shared configuration data structures
* FIX: Numerous changes to reduce memory consumption and memory leaks
* FIX: For apparent corruption of predicate statistics used during query optimisation
* FIX: For file handle leak when incrementally updating a lucene index when no changes have occurred
* FIX: For removing a class membership of an instance of a member of owl:intersectionOf set that does not remove the membership to the intersection itself
* FIX: Incorrect computation in complexity estimation that can lead to suboptimal query plan.
* FIX: Prevent out of sync state when aborting a transaction due to a consistency violation
* FIX: Added more checking for detecting a failed update on the probe worker node
This is an integrated release that includes:
- GraphDB Engine 6.0.8274
- GraphDB Workbench 6.2.3
- GraphDB Connectors 3.1.0

h1. Version 5.3 (build 5849)
h1. GraphDB version 6.0-RC6

This maintenance release addresses a number of critical cluster stability issues:
This is an integrated release that includes:
- GraphDB Engine 6.0.8120
- GraphDB Workbench 6.2.2
- GraphDB Connectors 3.1.0

* FIX: Rejected update causes worker to be flagged as OFF - this could lead to an unexpected deep-replication event in the presence of high query loads
* FIX: Deep replication can fail to complete properly. This intermittent problem can leave a worker node corrupt and unable to restart
* FIX: Worker can enter a hung state when trying to shut down for replication
* FIX: Cluster master initiates multiple deep-replication operations
* FIX: Consistency violation could cause calling thread to deadlock
* FIX: SPARQL updates should not be influenced by the query-timeout and query-limit-results parameters
h2. GraphDB Engine 6.0.8120

h1. Version 5.3 (build 5777)
The focus of this release was further improvement of update speed for larger transactions (10K\+ statements), as well as overall stability.

* Improvement: Small transaction logs are processed in memory, so that a series of small updates are processed more quickly.
* Improvement: Better values for statistics are used for query optimisation for certain statement patterns.
* FIX: Use of sesame:directType predicate returns extra incorrect matches during query answering.
* FIX: Query-timeout no longer applies to backup operations.
* FIX: Occasional NullPointerException when query evaluation exceeds time-out setting.
* FIX: Performance degradation after a large number of inserts and deletes due to incorrect predicate statistics that affect query optimisation.
* FIX: Spurious exception trace when re-initialising a Lucene FTS index.
* FIX: Certain configurations of Lucene FTS index can not be serialised.
* FIX: A bug prevented owl:sameAs statements from being visible during query answering.
* FIX: Allow for greater tolerance when checking worker status to avoid false "node off" events.
* FIX: Online backup fails to execute in some environments.
* FIX: Second phase of a cluster update is done in one wave.
Improvements:
- The transactions are serialized to JSON instead of XML. JSON Streaming parser is used - this minimizes the memory footprint of the master nodes
- improved handling of long transactions
- bigger transactions are compressed via GZip in the transaction log and when communicated between masters and workers
- GZipped updates can be sent via curl; add "Content-Encoding: gzip" header to use them
- Rule files can now be specified via HTTP and other protocols instead of local files
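
The notes mention sending GZipped updates with a "Content-Encoding: gzip" header via curl. A comparable request can be prepared from Python, as in this minimal sketch. The endpoint URL, the repository name and the "application/sparql-update" content type are assumptions for illustration; only the Content-Encoding header comes from the notes. The request is built but not sent.

```python
import gzip
import urllib.request


def build_gzipped_update(endpoint: str, sparql_update: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a gzip-compressed update."""
    body = gzip.compress(sparql_update.encode("utf-8"))
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/sparql-update",  # assumed content type
            "Content-Encoding": "gzip",                   # header named in the notes
        },
    )


# Hypothetical endpoint; adjust to the actual master or worker URL.
request = build_gzipped_update(
    "http://localhost:8080/openrdf-sesame/repositories/example/statements",
    "INSERT DATA { <urn:s> <urn:p> <urn:o> . }",
)
print(request.get_method(), request.get_header("Content-encoding"), len(request.data))
```

Passing the prepared object to urllib.request.urlopen would send it; whether the server accepts it depends on the deployment.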

h1. Version 5.3
Fixes:
- Fixed: OWLIM-1610 OOM on the Master with large update (600k statements)
- Fixed JMX update statistics
- Removed the obsolete Remote Master flag in the replication cluster
- Master backup folder moved over under sesame data folder
- Fixed Lucene plugin to support Custom Analyzers/Scorers (this was broken on 6.0 releases due to plugin classloader not loading jars from the plugin directory)
- [OWLIM-1730|OWLIM-1730] Fixed handling of a failed initialisation within LiteralsPlugin
- [OWLIM-1712|OWLIM-1712] Query Optimizer does not apply 'strong' equality within FILTER when both variables are used as subject of some statement patterns
- [OWLIM-1615|OWLIM-1615] Fixed how running queries are handled on shutdown

This is a maintenance release that includes [Sesame 2.6.10|http://www.openrdf.org/news.jsp#sesame-2.6.10] and the following significant updates:
h2. GraphDB Workbench

* Backup and restore methods have been added to the JMX interface. These use the replication function to copy a worker node database image to a directory on the cluster master node for making a backup, and also to copy an image from the master node to all attached workers when restoring from a backup.
* A new 'remote replication' feature allows two clusters to remain in synch. This can be useful when maintaining a disaster recovery cluster instance, because it allows the master in the remote cluster to appear like a normal worker node in order to receive updates from the master in the main cluster.
* A performance degradation when loading very large datasets has been fixed. After a few billion statements load performance started to drop and by around 6 billion statements performance was four times slower than it should be. After the fix the data loading speed at 20 billion statements is around 50% slower than with an empty database.
* It is now possible to put a global limit on the number of query results per query. Any queries that generate more results will have the remainder truncated. This feature can be useful for any public-facing SPARQL endpoints.
* Consistency checks in OWLIM-SE and OWLIM-Enterprise are now strictly enforced. If consistency checking is enabled, data that causes an inconsistency will not be allowed and an update transaction containing an inconsistency will abort and rollback to the previous database state. For example, if using the OWL2-RL ruleset an attempt to declare an individual as being a member of two disjoint classes will trigger a rollback.
- the full list of changes in the latest version is available in this page: [GraphDB-Workbench Release Notes|GraphDB6:GraphDB-Workbench Release Notes]
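
The strict consistency enforcement described in the list above can be pictured with a small in-memory sketch. It is not the engine's implementation: the class {{TinyStore}}, the exception name and the prefixed-name strings are all hypothetical, and a single disjointness check stands in for the OWL2-RL consistency rules.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


class InconsistencyError(Exception):
    """Raised when an update would make the data inconsistent."""


class TinyStore:
    """Hypothetical in-memory store that rejects inconsistent updates."""

    def __init__(self, disjoint_pairs: Iterable[Tuple[str, str]]):
        self.triples: Set[Triple] = set()
        # Each pair names two classes that may not share a member.
        self.disjoint = {frozenset(pair) for pair in disjoint_pairs}

    def _is_consistent(self, triples: Set[Triple]) -> bool:
        # Collect the declared classes of every individual.
        types = {}
        for s, p, o in triples:
            if p == "rdf:type":
                types.setdefault(s, set()).add(o)
        # An individual belonging to both classes of a disjoint pair is an error.
        return not any(
            pair <= classes
            for classes in types.values()
            for pair in self.disjoint
        )

    def commit(self, additions: Iterable[Triple]) -> None:
        # Work on a candidate state; the stored state is replaced only on success,
        # so a failed check leaves the previous database state untouched.
        candidate = self.triples | set(additions)
        if not self._is_consistent(candidate):
            raise InconsistencyError("update aborted and rolled back")
        self.triples = candidate
```

Note that the whole transaction is rejected, including any harmless statements it carried alongside the inconsistent one.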

The full set of updates for this release includes:
h2. GraphDB Connectors (experimental)

* New Feature
** OWLIM-527 - Allow for forced termination of an update transaction
** OWLIM-999 - Remote replication for OWLIM cluster and online backup/restore
- the full list of changes in the latest version is available in this page: [GraphDB6:GraphDB Connectors Release notes]

* Improvement
** OWLIM-720 - Cluster main node needs to check ruleset on nodes
** OWLIM-887 - MD5 snapshot is too slow for OWLIM-Enterprise
** OWLIM-888 - Improve logging of query execution plan and JMX query monitoring
** OWLIM-902 - Globally limit the number of results for queries
** OWLIM-908 - Improve query logging.

* Bug
** OWLIM-386 - Cluster constantly attempts to resynch when a worker is set up incorrectly
** OWLIM-559 - Builtin ruleset works differently if used as precompiled "owl-max(-optimized)" and through distribution Builtin_Rules.pie
** OWLIM-820 - LUBM fails with external ruleset
** OWLIM-822 - DELETE query with a wildcard predicate takes excessive time
** OWLIM-856 - Sesame server stops responding after period of use
** OWLIM-857 - OwlimSchemaRepository and SailImpl do not implement NotifyingSail.
** OWLIM-858 - DESCRIBE query causes SailConnectionImpl.evaluate() to throw RuntimeException
** OWLIM-860 - Query-timeout causes out of memory error
** OWLIM-862 - Lucene lock problem when using full-text search incremental update
** OWLIM-863 - ASK query matches statements in default graph when includeInferred=false
** OWLIM-864 - Memory leak with simple ASK query
** OWLIM-865 - Predicate list index causes some query results to be lost
** OWLIM-880 - Performance degradation after a large number of inserts and deletes
** OWLIM-883 - Rebuilding context index failed
** OWLIM-904 - Plug-in 'preprocess()' called twice within single request session
** OWLIM-911 - Accessing internal identifiers returns same value for all
** OWLIM-913 - Remove number of pages in POS/PSO from worker signature
** OWLIM-914 - Differently configured worker is not detected as OUT OF SYNCH
** OWLIM-917 - Cluster master unstable after removing worker node
** OWLIM-918 - Cluster does not allow any updates to be processed
h1. GraphDB version 6.0-RC5

* Task
** OWLIM-787 - More configuration parameters used to create worker node fingerprints.
** OWLIM-842 - Stateful replication
** OWLIM-866 - Add SPARQL update functionality to getting-started
** OWLIM-876 - Verify optional indices are up-to-date and rebuild if necessary
** OWLIM-877 - Log full version number at start-up
** OWLIM-900 - Allow plug-ins to force the rollback of a transaction
** OWLIM-905 - Force a rollback when a consistency check fails
This is an integrated release that includes:
- GraphDB Engine 6.0.8070
- GraphDB Workbench 6.0.2
- GraphDB Connectors 3.0.0.RC2

h1. Version 5.2 (build 5563)
h2. GraphDB Engine 6.0.8070

* Fix to prevent the query optimiser choosing a sub-optimal query plan after a long sequence of insert and delete modifications. Fragmentation of storage pages was causing errors in the complexity computations.
* Fix to prevent concurrent modification exceptions when namespaces are being updated.

h1. Version 5.2 (build 5512)
h3. Major changes:

* Fix to prevent a memory leak due to connection references kept by the PluginManager. The leak can also cause performance to degrade over time.
* Fix to dataset management that was causing explicit triples from the default (nameless) graph to be included as input to query execution when the query uses FROM or FROM NAMED and the includeInferred parameter is set to false.
- Improvements to the HA Cluster with respect to the New Cluster Tests: improved intra-cluster communications, worker initialization, status reporting, diagnostics and logging;
- Query monitoring via JMX - the full text of the query is now visible
- Fixes for the Constraint Violation support & multiple rulesets
- Faster update speeds
-- Now using GraphDB Custom NTriples/NQuads parser by default (so NTriples, NQuad formats are parsed faster than other formats)
-- When a transaction is using the *empty* ruleset, the commit can be added to all indexes in parallel. To use this experimental feature, add the special system statement \_:b <[http://owlim.ontotext.com/owlim/useParallel]> \_:b at the beginning of the transaction. This makes sense for larger transactions (10K statements and above).
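
As a rough illustration of the parallel-commit note above, the sketch below prepends the system statement to a batch only when the batch is large enough for it to pay off. The function name, the N-Triples style serialisation with a trailing dot, and the use of the 10K figure as a hard threshold are assumptions made for this example. How the batch is submitted to the repository is out of scope, and the marker is only meaningful when the repository uses the empty ruleset.

```python
from typing import Iterable, List

# System statement quoted in the release note, written as one N-Triples style line.
PARALLEL_MARKER = "_:b <http://owlim.ontotext.com/owlim/useParallel> _:b ."

# The note suggests the feature pays off from about 10K statements upwards.
LARGE_TRANSACTION = 10_000


def with_parallel_marker(statements: Iterable[str],
                         threshold: int = LARGE_TRANSACTION) -> List[str]:
    """Return the batch with the marker first if it is large enough."""
    batch = list(statements)
    if len(batch) >= threshold:
        # The marker has to be the first statement of the transaction.
        return [PARALLEL_MARKER] + batch
    return batch
```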

h1. Version 5.2 (build 5497)
h3. Full list of changes:

* Fix to prevent org.apache.lucene.store.LockObtainFailedException when incrementally updating a Lucene index. The index's configuration was being incorrectly serialised causing unpredictable behaviour.
* Fix for missing query results when the optional predicate-lists index is switched on. With this index enabled, statements with certain predicates were being ignored.
- OWLIM-1628 Added a fix of the issue of not being able to explore a ruleset when the empty ruleset was set initially.
- OWLIM-1626 RepositoryException in Worker is not thrown by the Master
- OWLIM-1600 Query returns no results when using FILTER and BIND(if(...)) in it.
- T-10 Implemented automatic entity pool restore procedure which can recover a truncated entity pool and removes the statements from the repo using the IDs beyond the new entity pool size
- OWLIM-1603 Owlim crashes with lock error without obvious reasons (there is no other process that might have locked the repo).
- OWLIM-1592 Queries with at least one sub-select which intersect with an ordinary block of statement patterns perform poorly because of multiple clones and transforms of the Sesame's query model to Owlim's one.
- OWLIM-1593 Fixed bug in MainQuery.clone() (when using Subselect and there are OPTIONALs)
- OWLIM-1572 Query Monitoring - show query text instead of query id
- OWLIM-1559 Fixed property path bug when same property paths are repeated in the query
- F-320 JMX: NumberOfExplicitTriples and NumberOfTriples shows \-1 even though data has been written to the triple store
- OWLIM-1563 Fixed the issue with custom ruleset + disable-sameAs=true.
- OWLIM-1559 Implemented a shortcut in the MINUS operator which allows for faster calculation when the MINUS is over two subqueries with one triple pattern (which may have filters).

h1. Version 5.2 (build 5479)
h2. GraphDB Workbench

* Fix for an out of memory error that can be caused when using the query-timeout parameter.
- the full list of changes in the latest version is available in this page: [GraphDB-Workbench Release Notes|GraphDB6:GraphDB-Workbench Release Notes]

h1. Version 5.2 (build 5421)
h2. GraphDB Connectors (experimental)

* Update to fire a JMX notification when a worker node is low on disk space
- the full list of changes in the latest version is available in this page: [GraphDB6:GraphDB Connectors Release notes]

h1. Version 5.2 (build 5331)
h1. GraphDB version 6.0-RC4

* Fix the known problem that prevents custom rule files being compiled when using Java 1.7
* Fix to avoid the stack overflow problem when optimising certain SPARQL queries that use the MINUS operator.
This is an integrated release that includes:
- GraphDB Engine 6.0.7914
- GraphDB Workbench 6.0.1
- GraphDB Connectors 3.0.0.RC2


h1. Version 5.2 (build 5316)
h1. GraphDB version 6.0-RC3 (build 7914)

This is a maintenance release that includes [Sesame 2.6.8|http://www.openrdf.org/news.jsp#sesame-2.6.8] ([change log 2.6.7|http://www.openrdf.org/issues/secure/ReleaseNote.jspa?projectId=10000&styleName=Html&version=10720] [change log 2.6.8|http://www.openrdf.org/issues/secure/ReleaseNote.jspa?projectId=10000&styleName=Html&version=10740]). Note that 2.6.7 is NOT backward compatible with 2.6.6 due to a couple of minor changes to interfaces. The following significant updates have been made:
h2. Fixes:

* A number of resilience improvements to cluster management
** Better handling of out of disk space problems
** Better communication between workers in all modes of operation
** New JMX operation to cancel replication
* Support for the N-Quads RDF format
* Changes to the Plug-in SDK
** Add transaction begin/end information to Statements.Listener interface
** Allow for pre-processor plug-ins to modify the query inside their request
** StatementIterator has new methods for testing read-only, explicit and implicit status
* Improvements to the getting-started application to allow it to load very large RDF files without the need to break them in to smaller pieces
* Improved locking (less contention) in the entity pool and when using RDF Rank
* Cache/index statistic now always collected over JMX (attribute to switch this on/off has been removed)
* The plugins were moved to <webapps>/openrdf-sesame/WEB-INF/classes/plugins;
* Running GraphDB under embedded Tomcat failed with NPE (because of non existing webapps/ folder).

Known problems:
h1. GraphDB version 6.0-RC2 (build 7892)

* Custom rule-sets will not compile when using Java 1.7 - this will be fixed in the near future with an interim update
h2. Improvements:

The full set of updates for this release includes:
* Added mini LDBC Semantic Publishing Benchmark ([http://ldbc.eu]) into benchmark/ldbc-spb folder in the distribution;
* The plugins are now in <webapps>/openrdf-sesame/plugins folder. Lucene plugin is enabled by default. This could be overwritten by the \-Dregister-external-plugins option;
* Minor rearrangement of the files in the main distribution folder (all .pie files are put into rules/ subfolder, the scripts into scripts/ subfolder).

* Bug
** OWLIM-359 - Support different file formats in "imports" parameter
** OWLIM-495 - Blank node contexts ignored by getStatements()
** OWLIM-696 - Context parameter ignored when reading statements using HTTP protocol
** OWLIM-767 - Improve thread synchronisation for RDF rank plug-in
** OWLIM-776 - Cluster failure due to replicating to a worker that is out of disk space
** OWLIM-782 - INSERT update hangs and consumes huge amount of memory
** OWLIM-784 - Poor query performance due to query optimisation problems when using the Sesame's QueryJoinOptimizer.
** OWLIM-804 - OWLIM-SE runs a query with an optional and a property path 10 times slower than OWLIM-Lite.
** OWLIM-807 - Using Predicate Lists has an adverse effect on query performance - incorrect complexity estimate
** OWLIM-813 - Slow deletion of statements using DELETE WHERE
** OWLIM-815 - Lucene FTS functional tests failing when executed against remote repository
** OWLIM-825 - Query timeout does not apply on certain queries
* Improvement
** OWLIM-697 - Add support for the NQuads RDF format
** OWLIM-590 - Improve efficiency of RDFRank recomputation
** OWLIM-706 - Collect and export statistics over JMX unconditionally
** OWLIM-777 - Worker nodes should respond appropriately on system status check in any state
* New Feature
** OWLIM-582 - Allow for Preprocessor plug-ins to modify the query inside their request
** OWLIM-774 - Create a read-only system named graph that will be used in the UniversalConverter to separate the schema statements from the other explicit statements.
* Task
** OWLIM-496 - Axiomatic statements should behave as inferred statements during query answering
** OWLIM-788 - Add transaction information to plug-in SDK callback interface
** OWLIM-796 - Add more methods to plug-in SDK StatementIterator to test for explicit and implicit attributes
** OWLIM-816 - Share BufferPool instances that manage ByteBuffers of the same size
** OWLIM-830 - Improve loading behaviour of getting started to handle huge RDF files
h2. Fixes:

h1. Version 5.1 (build 5208)
* Fixed issue with the default/evaluation license;
* Fixed issue with the LoadRDF tool.

* Fix the known problem in build 5183. An incompatibility between OWLIM and Sesame query optimisation (QueryJoinOptimizer) causes poor performance in certain circumstances. The use of QueryJoinOptimizer has been restricted to sub-select optimisation only.

h1. Version 5.1 (build 5183)
h1. GraphDB version 6.0 (build 7784)

This is a maintenance release that includes Sesame 2.6.6 and many fixes from interim releases made since 5.0. Repositories created with version 5.0 are binary compatible with 5.1, i.e. the OWLIM software can be updated and used with existing storage files created with version 5.0.
GraphDB 6.0 is a re-branded Owlim 5.6 version. The differences are given in the last stable Owlim 5.4 release.

The following improvements have been made:
* Axiomatic statements now treated as inferred statements during query answering
* License file can now be set using an environment variable: OWLIM_LICENSE_FILE
* Username and password parameters added to GettingStarted when using remote repositories with HTTP authentication
* This release includes [Sesame 2.6.6|http://www.openrdf.org/news.jsp#sesame-2.6.6] - please see the [change log|http://www.openrdf.org/issues/secure/ReleaseNote.jspa?projectId=10000&styleName=Html&version=10710] for this release.
h2. Improvements:

The following bugs have been addressed:
* Full replication fails when using different platform specific local pathnames.
* Read-only (imported) statements lose their read-only status when migrating from previous versions
* Increased memory demand in OWLIM due to delayed finalization
* Plugin API's statement modification methods (Statements.put/delete) don't allow modifications
* Fixed an integer overflow bug in the compression module that results in an exception whenever the overlay file grows beyond 2G.
* Fixed a bug that can cause a large increase in query time when using plugins. It occurs when a plugin triple pattern is combined with an ordinary one that has a very large collection size and is placed before the plugin triple pattern.
* Fixed a bug that can cause incorrect query optimisation and/or a NullPointerException at query time. This problem can occur when estimating the number of matching triples for patterns containing a predicate for which there are no asserted statements in the repository.
* Workaround to avoid unnecessary BottomUpJoinIteration(sub-selects intersection) when a sub-select is joined with an ordinary statement pattern or a join of such.
* System statements filtered out from getContextIDs()
* Resolved memory leaks when updates are mixed with queries involving unbound predicate variables, which caused all unused indexes to be kept locked.
* Apply external bindings prior to handling query optimization. Speeds up such queries by avoiding having filters in 'AfterOptionals'
* Collected Namespaces were not properly persisted on Windows
* Added transactional handling of changes in properties file - fingerprints, namespaces, geometry, etc.
* Rolled back transactions do not close transaction log files in a timely manner, leading to "too many open files" error when many rollbacks occur in sequence. Clean-up code has been relocated to ensure that it is called immediately.
* Equivalence class updates do not close temporary files in a timely manner, leading to "too many open files" error when many transactions containing owl:sameAs statements are committed in sequence. Temporary files are now removed immediately after the transaction completes.
* Rebuild of predicate lists fails on Windows - the old files were locked and not deleted.
* Repository lockfile is not released after failed initialisation. The new behaviour is to remove the lockfile if initialisation has failed and the lock file did not exist at the start of initialisation.
* Schema updates should not allow removal of inferred statements.
* Suboptimal query plan when using geospatial index
* System contexts (from ternary relations in rule files) are visible to the Sesame workbench when browsing contexts
* Namespaces lost when instance terminated
* High Availability Cluster;
* Fast writes in SAFE Mode (OWLIM 5.5 improvement, which led to incompatible binary formats between 5.4 and 5.5+);
* LoadRDF tool for faster bulk loading of data; speeds \~100KSt/s and above, without inference;
* Explain Plan like functionality;
* LVM-based Backup and Replication.

*Known problems*
h2. Fixes:

The improvements in both Sesame and OWLIM for better optimisation of sub-queries have unfortunately caused a regression in query performance when using property paths with optional elements. This issue is being urgently addressed and a fix will be available very soon via an interim 5.1 release (later build number).
* Databases created with one setting of the "entity-id-size" parameter (32 vs 40-bit) and opened with another setting, would crash in versions prior to 6.0. Now an exception is thrown and the repository is not initialized.

The problem affects queries of this form, e.g.

{noformat}
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT * WHERE {
ex:PersonA foaf:knows/foaf:knows?/foaf:name ?name .
}
{noformat}

If you experience this problem, it might be possible to re-write such queries using a UNION, e.g.
h1. Version 5.6 (build 7713)

{noformat}
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.com#>
SELECT * WHERE {
{ ex:PersonA foaf:knows/foaf:name ?name }
UNION
{ ex:PersonA foaf:knows/foaf:knows/foaf:name ?name }
# with more UNIONs for longer paths as necessary
}
{noformat}
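
The UNION rewrite shown above can be generated mechanically. The following sketch enumerates every way of keeping or dropping the optional steps of a sequence path and emits one UNION branch per combination. The function names and the (predicate, optional) representation are invented for this example, and solution multiplicities of the rewritten query may differ from those of the original property path.

```python
from itertools import product
from typing import List, Sequence, Tuple

Step = Tuple[str, bool]  # (predicate, is_optional)


def expand_optional_steps(steps: Sequence[Step]) -> List[str]:
    """List all plain sequence paths obtained by keeping or dropping optional steps."""
    if all(optional for _, optional in steps):
        # A zero-length path cannot be written as a simple triple pattern.
        raise ValueError("at least one step must be mandatory")
    # For an optional step try "dropped" first, then "kept"; mandatory steps are kept.
    choices = [((), (pred,)) if optional else ((pred,),) for pred, optional in steps]
    return ["/".join(p for part in combo for p in part) for combo in product(*choices)]


def union_block(subject: str, steps: Sequence[Step], obj: str) -> str:
    """Build the group of UNION branches for the expanded paths."""
    branches = ["{ %s %s %s }" % (subject, path, obj)
                for path in expand_optional_steps(steps)]
    return "\nUNION\n".join(branches)


if __name__ == "__main__":
    path = [("foaf:knows", False), ("foaf:knows", True), ("foaf:name", False)]
    print(union_block("ex:PersonA", path, "?name"))
```

For the query above this produces the same two branches as the hand-written rewrite.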
h2. Improvements:

h1. Version 5.0
* [LVM-based Backup and Replication|OWLIM56:LVM-based Backup and Replication] \- Backup can optionally be based on the LVM Shadow Volume Copy - which makes it faster and the worker is released a few seconds after the backup is started (ported from 5.4).
* [New Cluster Test (cluster deployment and test tool)|OWLIM56:New Cluster Test (cluster deployment and test tool)] \- a tool for automated deployment and testing of clusters of various sizes. Can deploy on AWS and local instances. Supports docker format. Allows for acceptance, stress and load tests to be run on the deployed clusters. Optionally, creates Nagios configuration for the deployed cluster.
* LoadRDF tool - a tool for faster bulk loading of data, which has been merged from the 5.5 branch.
* Merged EntityPool Reverse Cache from 5.5 - speeds up large updates (100\+ statements).

* This version of OWLIM-Enterprise is *not backwardly compatible* with any previous version. This means that images created with OWLIM 4.3 and before will not work correctly with OWLIM 5.0 and must be re-created. There have been a great many modifications to the storage files, indexing structures, etc, and upgrade mechanisms have proven too complex and probably slower than re-loading the database anyway. Please *do not attempt to upgrade to OWLIM 5.0 unless you drop and recreate all databases*.
h2. Fixes:

* *Transaction management and isolation mechanisms* have been completely refactored. The previous strategy used very lazy writing of modified database pages, such that dirty pages were only flushed to disk when further updates occur and no more memory is available. While extremely fast, the problem with this approach is that there is a considerable recovery time associated with replaying the transaction log after an abnormal termination. The new mechanism uses two modes: 'bulk-loading' (fast) with similar behaviour to previous versions and 'normal' (safe) where database modifications are flushed to disk as part of the commit operation. When running in safe mode, *database recovery is instant* and there is a *significant improvement in concurrency between updates and queries*. Some related changes are:
** There is a new parameter to control the transaction 'mode' called {{transaction-mode}} - see the [configuration section|GraphDB-SE Configuration]
** The {{database-recovery-policy}} configuration parameter is no longer required and has been removed
** The special flush predicate {{http://www.ontotext.com/flush}} used to force pages to be written to disk is no longer required and has been removed. Statements using this predicate will be treated like any other statement.
** The special reinfer predicate {{http://www.ontotext.com/owlim/system#reinfer}} used to force a re-computation of all inferences has been removed. Statements using this predicate will be treated like any other statement.
** In fast transaction mode, the isolation constraint can be relaxed in order to improve concurrency behaviour when strict read isolation is not a requirement - this is controlled by a new parameter {{transaction-isolation}} that only has an effect in fast mode, see the [configuration section|GraphDB-SE Configuration]
** No recovery mechanisms are in place when running in fast mode - therefore administrators must treat an abnormal termination during bulk-loading as a fatal event and must restart the loading procedure
* All AcceptanceTests that were previously failing are now fixed:
** Improved communication between master and worker nodes with respect to the acceptance tests;
** Worker thread: fixed out-of-sync handling upon initialisation;
* Improved logging; in particular, fixed the skipping of some stack traces by the JVM;
* Initialisation of a 5.6 worker from a 5.4 image now skips "entityIdSize" and InferencerCRC from owlim.properties.

* *New context indices* can be used to improve query performance when data is modeled using many named graphs. These are switched on and off using a single configuration parameter {{enable-context-index}} - see the [configuration section|GraphDB-SE Configuration]
h1. Version 5.6 beta 3 (build 7659)

* The *SPARQL 1.1 Graph Store HTTP Protocol* is now supported according to the [W3C Working Draft|http://www.w3.org/TR/sparql11-http-rdf-update/] from the 12th May 2011. This provides a REST interface for managing collections of graphs, using either directly or indirectly named graphs.
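
To make the distinction between directly and indirectly named graphs concrete: with direct identification the request URL is itself the graph IRI, whereas with indirect identification the graph IRI is passed to a graph store service in a {{graph}} query parameter, or {{default}} is used to address the default graph. The helper below is a minimal sketch of the indirect form only. The function name and the service URL used with it are placeholders, not the actual endpoint layout of any particular server.

```python
from typing import Optional
from urllib.parse import urlencode


def graph_store_url(service_url: str, graph_iri: Optional[str] = None) -> str:
    """Build a request URL that identifies a graph indirectly.

    service_url is a placeholder for the graph store service of the server.
    """
    if graph_iri is None:
        # No IRI given: address the default graph.
        return service_url + "?default"
    # The graph IRI travels percent-encoded in the 'graph' parameter.
    return service_url + "?" + urlencode({"graph": graph_iri})
```

The resulting URL can then be used with GET, PUT, POST or DELETE to read, replace, extend or drop the graph.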
Fixes:
* cluster - empty worker initialisation;
* worker - initial update handling;
* log sync: 10s wait between idle rounds (network bandwidth optimisation);
* Tx log: initialisation bug fixed;
* update might fail when replication is in progress;
* miscellaneous bug fixes in the cluster utils (deployment/status) and proxies restart;
* detailed logging:
** replication cluster worker events;
** HTTP client stats;
** Tx log initialisation.

* *Sesame 2.6.5* with many bug-fixes and updates to bring SPARQL 1.1 Query support up to the latest W3C Working Draft from the 5th January 2012.
Known issues:
* AcceptanceTests failing: W4, M4, MW3, MW7, MW8;
* the new/experimental LVM backup/restore feature is not yet ported from 5.4 (and thus MW10 and MW11 Acceptance Tests are not implemented, because they are based on it).

* *Significant reduction in disk-space requirements* is achieved with the following modifications:
** *Index compression* can now be used to reduce disk storage requirements by using zip compression on database pages. This feature is off by default, but can be switched on when creating a new repository. The configuration parameter {{index-compression-ratio}} can be set to -1 (the default value, indicating no compression) or a value in the range [10-50] indicating the desired percentage reduction in page sizes. Any pages that cannot be compressed by the specified amount are stored uncompressed, so a compression ratio that is too aggressive will bring little benefit. Experiments have shown that for large datasets a value of about 30% is close to optimal.
** *Restructuring of the triple indices* has also led to a reduction in disk-space requirements of around 18% independent of the compression functionality
** *Entity compression* is a modification that reduces the storage requirements for the lookup table that maps between internal identifiers and resources. This is transparent to the user and happens automatically. More disk space reductions are apparent using this version.
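
The rules for {{index-compression-ratio}} given above can be summarised in a few lines. This is only a sketch of the documented semantics, not the engine's code: both function names are made up for the example, and the page sizes passed in are arbitrary.

```python
from typing import Optional


def validate_index_compression_ratio(value: int) -> Optional[int]:
    """Return None for 'no compression' (-1) or the ratio if it lies in 10..50."""
    if value == -1:
        return None
    if 10 <= value <= 50:
        return value
    raise ValueError("index-compression-ratio must be -1 or between 10 and 50")


def page_is_stored_compressed(original_size: int, compressed_size: int,
                              ratio: Optional[int]) -> bool:
    """A page stays compressed only if it shrank by at least `ratio` percent."""
    if ratio is None:
        return False
    # Integer form of: (original - compressed) / original >= ratio / 100
    return (original_size - compressed_size) * 100 >= ratio * original_size
```

With a requested ratio of 30, a page that shrinks from 8192 to 5000 bytes (about 39%) would be stored compressed, while one that only shrinks to 7000 bytes (about 15%) would be stored uncompressed. This is why an overly aggressive setting leaves most pages uncompressed.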
h1. Version 5.6 beta 2 (build 7523)

* A new *literal index* is created automatically for numeric and date/time data-types. The index is used during query evaluation only if a query or a subquery (e.g. union) has a filter that is comprised of a conjunction of literal constraints, e.g. FILTER(?x >= 3 && ?y <= 5 && ?start > "2001-01-01"^^xsd:date). Other patterns, including those that use negation, will not use the index for this version of OWLIM.
Fixes:
- updated AcceptanceTests in the MastersAndWorkers section;
- replication start/wait methods improved;
- several fixes to the TxLog protocol;
- fixed replication logic to delete the Worker repo, only when the remote worker confirms the replication;
- additional sanity checks added to the Master-to-Master and Master-to-Worker synchronisation;
- improved logging, incl. "SPLITBRAIN" events logged both to logs and to JMX.

* All *control queries now use SPARQL Update syntax* (used mostly to control the Lucene-based full-text search, RDF Rank and geo-spatial plug-ins). This has a number of advantages, namely:
** No special control query pseudo-graph is required by the Replication Cluster master in order to identify control queries that must be pushed to all worker nodes
** SPARQL Updates use the corresponding SPARQL update protocol, so they can be automatically processed by load-balancers that examine URL patterns
** It is more consistent with the SPARQL language, since these 'control queries' cause a change of state in OWLIM
Known issues:
- some MW\* tests with the forced replication fail randomly, but rarely - related to the Proxy tool.

* *Incremental Lucene-based full-text search index* for updating the index for specific resources or all un-indexed resources. Using this technique can avoid the more expensive approach of rebuilding the whole index frequently.
h1. Version 5.6 beta 1 (build 7368)

* *Incremental RDF Rank* allows the RDF rank for specific resources to be (re-)computed as directed by the user. This technique can avoid the more expensive approach of rebuilding all RDF Rank values frequently.
Ontotext redesigned its cluster architecture to support the case of two or more separate data centres (each with its own Master and Worker nodes), and to provide asynchronous transactions and Master fail-over. OWLIM Enterprise already supported Master-Worker clusters with Automatic Replication, Load-balancing and Transaction Logs, but in this release these components are improved. OWLIM 5.6 is based on 5.5 and inherits its write performance improvements.
* [OWLIM56:Client Fail-over Utility|OWLIM56:Client Fail-over Utility], which can be configured to fallback to the next master, if the first master becomes unavailable;
* Better TransactionLog support (see: [OWLIM56:Transaction Log Improvements]) - the updates are synchronised between all masters in all data centers;
* All Masters are now Read/Write;
* [OWLIM56:Smart Replication];
* Protocol backwards compatibility - the ability to upgrade the OWLIM cluster without downtime, following the OWLIM Upgrade Procedure;
* [External Plug-ins|OWLIM56:External Plug-ins] \- the plugins in OWLIM are moved into a separate plugin directory, and now can be upgraded/maintained separately.

* The *Geo-spatial index has been updated* to support 40-bit resource identifiers.
h2. Known issues:

* The *getting started* application has been restructured so that it now works with remote repositories.
* Transaction consistency concern. In the new cluster, the Master responds to an update from a client as soon as the test node completes it.

* OWLIM-Enterprise also includes the following maintenance updates and fixes:
** Bugs
*** OWLIM-703 Getting started in OWLIM-Enterprise distribution zip has incorrect sail type in owlim.ttl
*** OWLIM-669 Sesame workbench no longer provides OWLIM options when creating a new repository
*** OWLIM-624 Entering the license file in Sesame Workbench requires backslashes to be escaped (twice)
** New features
*** OWLIM-671 Port the custom "Update Timeout" functionality into 4.3 and 5.0 branches
*** OWLIM-636 Improved software license validation
** Other
*** OWLIM-629 Make Getting Started more robust
*** OWLIM-522 Refactor LUBM test drivers
*** OWLIM-519 Drop Java 1.5 compatibility (change to Java 1.6)
*** OWLIM-498 Format and clean-up entire OWLIM code-base using eclipse formatter
*** OWLIM-422 Reformulate control queries to use SPARQL 1.1 Update syntax

Known problems with OWLIM 5.0 BETA 3

* The behaviour of the 'include inferred' checkbox in the Sesame Workbench is unpredictable when using OWLIM repositories.

h1. Version 4.3

Further contributions to the Sesame framework from Ontotext and [Fluid Operations|http://www.fluidops.com/] mean that [Sesame version 2.6|http://www.openrdf.org/news.jsp#sesame-2.6.0] is included with this version of OWLIM. The following new features are available:

* SPARQL 1.1 Federation support that allows queries to pull together data from any number of distributed SPARQL endpoints
* A new SPARQL repository type to wrap SPARQL endpoints
* Improvements to the parser for controlling the level of literal/data-type validation and the handling of errors
* Many other fixes for compliance with the latest revised SPARQL 1.1 working drafts

OWLIM now has a plug-in API that allows users to build software components that alter the behaviour of OWLIM. This mechanism can be used to add new features or to improve performance in certain scenarios.

OWLIM also includes the following maintenance updates and fixes:

* OWLIM-205 - Validate literal languages and do not allow invalid language tags to enter the repository
* OWLIM-273 - Potential thread leak in QueryModelConverter
* OWLIM-390 - Counting statements using Sesame API gives strange results.
* OWLIM-419 - Make RepositoryConnection.exportStatements obey the time limit
* OWLIM-426 - Unable to permanently remove predefined namespace definitions
* OWLIM-428 - Explicit axioms don't show up as explicit if they have been inferred before by other axioms
* OWLIM-463 - Clear transaction log in replication cluster if it cannot be initialized
* OWLIM-466 - SesameConnectionImpl.getStatements must return quads, not trips (breaks workbench explore)
* OWLIM-470 - Query with Union and optional returns wrong results
* OWLIM-471 - Can not access new repository when FTS switched on (divide by zero or lockfile locked)
* OWLIM-473 - onto:explicit pseudo-graph does not prevent implicit statements as input for query answering
* OWLIM-475 - Repackaged console.sh in openrdf-console.zip has lost its execute attribute
* OWLIM-476 - Neither of the slf4j jars (api or jdk14) are needed in the war files
* OWLIM-483 - Lost solutions to queries with FROM <...> clause
* OWLIM-485 - Repository with many transactions fails to get restored
* OWLIM-488 - Incorrect behaviour of FROM and FROM NAMED in SPARQL queries
* OWLIM-489 - Predicate list indices do not log statistics
* OWLIM-490 - User-supplied Dataset object on query not properly handled
* OWLIM-491 - Query rewriting in MainQuery.convertToOptimizedForm() converts OR to AND in filters when converting the condition to disjunctive normal form
* OWLIM-495 - Blank node contexts ignored by getStatements()
* OWLIM-501 - Lucene and OPTIONAL query bug
* OWLIM-502 - The database restorer deletes the pso and pos files after second unsuccessful restore
* OWLIM-457 - Validate data-type values at load time
* OWLIM-497 - Update getting-started and add timestamps
* OWLIM-356 - Optimized rule set is not compatible with the rule compiler.
* OWLIM-480 - Make use of the com.ontotext.trree.collections for the predicate map in order to reuse the file header and the common interface


h1. Version 4.2

Ontotext have continued to invest in the Sesame project and are pleased to announce the inclusion of [Sesame version 2.5|http://www.openrdf.org/news.jsp#sesame-2.5.0] with this version of OWLIM. The benefits include:

* [SPARQL 1.1 Update|http://www.w3.org/TR/sparql11-update/] - this extension of SPARQL provides a [much more powerful method to modify|http://www.ontotext.com/owlim/sparql11] RDF databases without the requirement for developers to use frameworks and APIs.
* SPARQL 1.1 Query conformance has been updated to the [May 2011 working draft|http://www.w3.org/TR/sparql11-query/], i.e. all the remaining behaviour has been implemented along with all the new SPARQL filter functions.
* The SPARQL protocol has also been updated to the [January 2010 working draft|http://www.w3.org/TR/sparql11-protocol/].
* A new binary RDF serialization format. This format has been derived from the existing binary tuple results format. Its main features are reduced parsing overhead and minimal memory requirements.
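
As an illustration of SPARQL 1.1 Update in use, the following sketch builds (but does not send) an HTTP request carrying an update, using the form-encoded POST style described in the SPARQL 1.1 Protocol draft. The server URL, repository ID and example data are hypothetical, and the {{/statements}} path reflects the Sesame HTTP protocol as we understand it, so check both against your own deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: adjust host, port and repository ID for your setup.
ENDPOINT = "http://localhost:8080/openrdf-sesame/repositories/my-repo/statements"

# Example update with made-up data.
UPDATE = """
PREFIX ex: <http://example.org/>
INSERT DATA { ex:book1 ex:title "A new book" }
"""


def build_update_request(endpoint, update):
    """Build a POST request with the update in a form-encoded 'update' parameter."""
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_update_request(ENDPOINT, UPDATE)
# To run it against a live server: urllib.request.urlopen(request)
```

No framework or repository API is involved: the whole modification is expressed in the update string itself.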

As well as integration with the new Sesame APIs and modifications for optimising SPARQL Update, there have also been a number of bug fixes in this version of OWLIM-Enterprise:

* OWLIM-396 - A RuntimeException is thrown in clearNamespaces() in SailConnection
* OWLIM-404 - HashEntityPool fails to store/read its entity index table if its size is more than ~500M
* OWLIM-408 - Getting of default namespace doesn't work
* OWLIM-440 - Can not create geo-spatial index when using OWLIM-SE with Tomcat
* OWLIM-443 - Repository fails to start - entity pool error
* OWLIM-445 - disable-sameAs causing query evaluation to lose bindings
* OWLIM-446 - Query.setIncludeInferred() is ignored
* OWLIM-447 - License file cannot be specified - default evaluation license is always used.
* OWLIM-449 - Wrong conversion from int to long in com.ontotext.trree.plugin.lucene.LuceneIterator
* OWLIM-452 - Multiple wrong results are returned for a CONSTRUCT query
* OWLIM-454 - EntityStorageVersion3 fails to restore if a long entity has negative size.
* OWLIM-455 - Cannot put any more statements in AVL tree after ~3.1B statements added during 3.5-to-4.0 conversion
* OWLIM-305 - Rationalise OWLIM vocabulary


h1. Version 4.1

This maintenance release includes Sesame 2.4.2, which fixes several important bugs in SPARQL 1.1 Query support:
* [Query performance degradation with assertions enabled|http://www.openrdf.org/issues/browse/SES-797]
* [No Query Result if Aggregate Function used with OPTIONAL|http://www.openrdf.org/issues/browse/SES-798]
* [GROUP BY with complex expression not evaluated correctly|http://www.openrdf.org/issues/browse/SES-774]
* [SUM() with GROUP BY on empty solution results in error|http://www.openrdf.org/issues/browse/SES-786]
* [IN operator fails on empty argument list|http://www.openrdf.org/issues/browse/SES-787]
* [ArbitraryLengthPath with lowerbound 0 fails when no zero-length match is found|http://www.openrdf.org/issues/browse/SES-791]
* [SPARQL parser constructs incorrect query model for some property paths involving alternatives|http://www.openrdf.org/issues/browse/SES-792]
* [SPARQL parser fails to introduce implicit grouping on some queries|http://www.openrdf.org/issues/browse/SES-793]
* [Let SUM operator silently ignore non-numeric arguments|http://www.openrdf.org/issues/browse/SES-788]

Also included are some updates to OWLIM-SE:
* Unexpected binding returned in a SPARQL query with union within an optional expression
* FILTER in OPTIONAL patterns returns incorrect results
* Aggregate SPARQL query fails with IndexOutOfBoundsException
* Default and named graphs set in a SPARQL query are ignored by the Jena connector


h1. Version 4.0

* *OWLIM Replication Cluster* has been renamed to *OWLIM-Enterprise* and is distributed separately from *OWLIM-SE*. This new name better identifies this software component as the flagship product of the OWLIM family suitable for mission critical applications.
* *Easy to deploy WAR files:* The distribution now includes {{openrdf-sesame}} and {{openrdf-workbench}} Web applications pre-configured with OWLIM and ready to deploy. This makes installing OWLIM as a server and creating/administering OWLIM repositories trivially simple. The WAR files can be found in the {{sesame_owlim}} directory of the distribution ZIP file. See 'easy install' in the [installation section|OWLIM-Enterprise Installation].
* *SPARQL 1.1 Query:* Ontotext has invested significant development resources in the [Sesame project|http://www.openrdf.org/] in order to bring SPARQL 1.1 support to all editions of OWLIM. Since OWLIM-Enterprise is a distributed architecture based on OWLIM-SE, OWLIM-Enterprise also includes [SPARQL 1.1 Query|http://www.w3.org/TR/sparql11-query/], but without federation support for the moment. [SPARQL 1.1 Update|http://www.w3.org/TR/sparql11-update/] support will be included in the next release. The new features include:
** Aggregates
** Subqueries
** Negation
** Expressions in the SELECT clause
** Property Paths
** Assignment
** A short form for CONSTRUCT
** An expanded set of functions and operators
* The SPARQL 1.1 specification has not yet become a W3C recommendation and continues to evolve. The following known issues apply to this release of OWLIM and Sesame:
** {{fn:concat}} is not supported. This was added to the working draft in May, just after the Sesame 2.4.0 release was finalised. It will likely be included in the next Sesame/OWLIM release.
** Empty IN() and NOT IN() clauses will cause an exception - will be fixed in the next release.
** Using the aggregate function SUM() will cause an exception if there are no bindings over which to do the summation - will be fixed in the next release.
** Federation is not yet supported. This will be implemented in a version of Sesame and OWLIM later this year.
** There are some problems with [complex expressions|http://www.w3.org/TR/sparql11-query/#selectExpressions] in the SELECT clause. This should be fixed in the next release of Sesame/OWLIM.
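
Until the empty IN() and NOT IN() issue is fixed, client code that generates queries can avoid emitting an empty list. The helper below is a hypothetical sketch and not part of OWLIM or Sesame. It relies on the SPARQL 1.1 draft semantics, under which IN over an empty list is false and NOT IN over an empty list is true, and it expects terms that are already serialised in SPARQL syntax.

```python
def in_filter(variable, terms, negate=False):
    """Return a FILTER clause for `variable` IN / NOT IN `terms`.

    `terms` must already be valid SPARQL terms, e.g. '<http://example.org/a>'
    or '"x"'. An empty list is never written into the query.
    """
    terms = list(terms)
    if not terms:
        # Empty IN() is always false; empty NOT IN() is always true,
        # so the latter needs no filter at all.
        return "" if negate else "FILTER(false)"
    operator = "NOT IN" if negate else "IN"
    return "FILTER(?%s %s (%s))" % (variable, operator, ", ".join(terms))
```

For example, {{in_filter("x", [])}} yields {{FILTER(false)}} instead of a clause that would trigger the exception.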

h1. Version 3.5

This release includes many bug fixes, several new features and updates:
* *Write-only worker node:* When worker nodes are added to the cluster via the JMX interface, they can be specified as being 'write-only'. These nodes will be kept in sync with the rest of the cluster, but will not take part in answering cluster queries. The motivation for this feature is to have one or more worker nodes available for batch processing of queries, so that this processing does not affect the overall query performance of the cluster.
* *Remote notifications:* A new mechanism to complement the existing high-performance 'in-process' notification mechanism. This new mechanism allows clients to subscribe to given statement patterns on OWLIM Replication Cluster master nodes.
* *Online documentation:* As well as the PDF format user guides included in the OWLIM distribution zip files, [the latest documentation for all editions of OWLIM is now available online|Home].
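
The effect of the write-only flag can be pictured with a small model. This is a toy sketch, not the cluster's implementation: the {{Worker}} class and both functions are hypothetical names, and the round-robin choice of query node is an assumption made only for illustration.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Worker:
    name: str
    write_only: bool = False


def update_targets(workers):
    """Every worker, write-only or not, receives updates to stay in sync."""
    return list(workers)


def query_rotation(workers):
    """Cycle over the workers that answer cluster queries, skipping write-only ones."""
    readable = [w for w in workers if not w.write_only]
    if not readable:
        raise ValueError("no worker is available for cluster queries")
    return cycle(readable)


workers = [Worker("w1"), Worker("w2"), Worker("batch", write_only=True)]
rotation = query_rotation(workers)
first_four = [next(rotation).name for _ in range(4)]
```

In this model the {{batch}} worker receives every update but never appears in the query rotation, which leaves it free for batch work.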


h1. Version 3.4

* *Replication cluster introduced in this version of OWLIM:* brings resilience, failover and horizontally scalable parallel query processing. A master node component is included that can manage a cluster of worker nodes (standard BigOWLIM instances) to synchronise updates, cater for node failure, dynamically add/remove worker nodes and distribute query requests. Such a setup allows for massive concurrent query performance, where the number of queries processed per second scales almost linearly with the number of worker nodes.
In a single-threaded scenario, the next query can be evaluated on a node that has either not yet received the preceding update or not yet completed it, which could lead to inconsistency from the client application's point of view. This deviates from the update processing in 5.4, where the response is created only after the last of the available nodes completes the update \[OWLIM-1483\].