GraphDB-Enterprise Release Notes

compared with
Current by nikola.petrov
on Apr 09, 2015 17:47.

Starting from version 6, GraphDB includes three separate products with their own version numbers: GraphDB Engine, GraphDB Workbench and GraphDB Connectors (experimental). New features and significant bug-fixes/updates for the last few releases are recorded here. Each product's full version number is given as:

{panel}major.minor.build_number{panel}

e.g. 5.3.5928 where the major number is 5, the minor number is 3 and the build number is 5928.
The integrated releases have their own version, e.g. 6.0-RC1.

Releases with the same major and minor version numbers do not contain any new features. The only difference is that releases with later build numbers contain fixes for bugs discovered since the previous release. New or significantly changed features are released with a higher major or minor version number.
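
The version scheme above can be sketched in a few lines of Python. This is only an illustration: the names {{ProductVersion}}, {{parse_version}} and {{is_bugfix_only}} are hypothetical and not part of any GraphDB API, and the sketch assumes plain numeric {{major.minor.build_number}} strings, so integrated release labels such as 6.0-RC1 are not handled.

```python
from typing import NamedTuple


class ProductVersion(NamedTuple):
    """Hypothetical container for a major.minor.build_number version."""
    major: int
    minor: int
    build: int


def parse_version(text: str) -> ProductVersion:
    """Parse a 'major.minor.build_number' string such as '5.3.5928'."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a major.minor.build_number version: {text!r}")
    major, minor, build = (int(p) for p in parts)
    return ProductVersion(major, minor, build)


def is_bugfix_only(old: str, new: str) -> bool:
    """True when both versions share major and minor and 'new' has a later build."""
    a, b = parse_version(old), parse_version(new)
    return (a.major, a.minor) == (b.major, b.minor) and b.build > a.build
```

Under this reading, moving from 6.0.8070 to 6.0.8120 brings only bug fixes, while a change in the major or minor number signals new or significantly changed features.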

{toc}

h1. GraphDB 6.1-SP3
This Service Pack 3 build addresses some reported problems with 6.1.

h2. GraphDB Engine 6.1-b4371143

* OWLIM-1888 Better inference performance for datasets with a large number of classes and onProperty restrictions
* Fixed a problem where queries containing UNION and BIND were under certain circumstances returning incorrect results
* New merge command for the storage tool
* OWLIM-1932 Count from graph http://www.ontotext.com/count does not work for describe queries
* OWLIM-1934 Removed some old and unwanted namespaces that were included by default
* OWLIM-1928 Added reversed numeric and date literal indices for better performance
* OWLIM-1990 Worker data is deleted when a plugin fails to initialize
* Now using a store-backed queue to keep transaction operations on the master. This should prevent out-of-memory errors when importing huge files


h2. GraphDB Workbench 6.4.1
- See all the release notes here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB-Workbench+Release+Notes]


h2. GraphDB Connectors 3.1.4
- See all the release notes here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB+Connectors+Release+notes]


h1. GraphDB 6.1-SP1
This Service Pack 1 build addresses a consistency issue with the GraphDB Storage. The cluster's backup/restore functionality is also fixed.

h2. GraphDB 6.1.8410
- extended new cluster backup/restore to support multiple named backups. Important note: the backup now uses a name instead of an absolute folder
- fixes for the master URL parameter (autodetected in most cases, thus no need to specify manually)
- minor cluster improvements (shutdown on OOM during sync; worker: handling multiple initializations and no reporting of missing fingerprints on init; fixed race condition in replication; new internal URIs (not URLs) for replication)
- fixed: [owlim-1853] LoadRDF under Windows sometimes does not include statements in the PSO index, which leaves the indexes inconsistent
- Added a check/warning to see if there is enough memory for the entity pool when it gets restored from persistence (-Xmx - cache-memory should be >= entity pool size * 1.25, i.e. 25% overhead is left for the datatype index and other in-memory structures included in the entity pool and for other purposes, e.g. for running queries)
- Constraint validation and support for multiple rulesets are out of the experimental stage and are now in production. See the docs here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB+Constraint+Validation]
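
The memory rule quoted in the entity pool check above (-Xmx - cache-memory >= entity pool size * 1.25) can be worked through as follows. This is a sketch of the arithmetic only, not the engine's actual check: the function name is hypothetical, and it assumes all three values are expressed in the same unit (bytes here).

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def entity_pool_memory_ok(xmx: int, cache_memory: int, entity_pool_size: int) -> bool:
    """Apply the rule: (-Xmx - cache-memory) >= entity pool size * 1.25."""
    available = xmx - cache_memory          # heap left after the cache is reserved
    required = entity_pool_size * 1.25      # entity pool plus 25% overhead
    return available >= required


# Example: 8 GiB heap, 4 GiB cache, 3 GiB entity pool.
# available = 4 GiB, required = 3.75 GiB, so the rule is satisfied.
fits = entity_pool_memory_ok(8 * GIB, 4 * GIB, 3 * GIB)

# Example: same heap and cache, 4 GiB entity pool.
# required = 5 GiB, which exceeds the 4 GiB available, so a warning would be expected.
too_big = entity_pool_memory_ok(8 * GIB, 4 * GIB, 4 * GIB)
```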

h2. GraphDB Workbench 6.3.2
- See all the release notes here: [https://confluence.ontotext.com/display/GraphDB6/GraphDB-Workbench+Release+Notes]


h1. GraphDB - 6.1.8316
This is an Enterprise release only.

h2. Improvements (replication cluster):
- Tx Log stability improvements & fixes - scrapped optimistic join procedure; scrapped shared horizon & completed queue; fixed status patching base
- Improved logging detail
- Updates are dispatched to the other peers immediately once accepted and enqueued
- dedicated peer dispatch threads
- fixed split-brain recovery
- Client API: Changed the master retry policy (masters are no longer checked too often, but a previously unavailable master is still detected when it becomes available again)

h2. Replication improvements:
- cleanup target directory on accepting replication data
- 60s timeout in replication client
- limited retries in the replication server
- replication client startup delay patch
- retry serving replication data on failure
- exposed exceptions thrown from the replication server

h2. Other improvements:
- Fixed: the inferencer debug statistics were not displayed at shutdown, because the statistics of the SwitchableInferencer were used instead of those of the currentInferencer that did the actual work
- Created StorageTool app, used to Scan/Rebuild indexes in case of missing statements

h3. Fixes:
- [OWLIM-1838] Custom NTriples/NQuads parser: ArrayOutOfBoundsException can be thrown on empty lines while parsing and then NPE while trying to process t.getMessage() which may be null
- [OWLIM-1834] Fixed: LoadRDF Tool doesn't work with GraphDB Enterprise
- [W-44] Introduced a parameter 'in-clause-max-members' (defaults to 16) which limits the number of entities specified in the IN clause. If the IN clause contains more elements than this limit, it is not optimized to a UNION query
- [W-55] Fixed the ASC/DESC issue in ORDER BY clause when the variable in the ORDER BY is in an OPTIONAL clause (the issue was different number of results returned in ASC/DESC cases).


h1. GraphDB version 6.1-RC1

Highlights:

- *Various stability fixes to the cluster*, including proper master shutdown sequence error handling, error-resilient synchronization threads, safe saving of configuration properties of the cluster config. The repository fingerprint now also reflects the number of statements and there is better handling of stress events which happen during transactions & dirty shutdowns (e.g. out of disk space)

- *Much faster write transactions* for small insert/update/delete operations on large repositories. Results on the LDBC [Semantic Publishing Benchmark|http://ldbcouncil.org/developer/spb] (SPB) at 50M went up from 32 read and 12 write queries per second in ver. 6.0 to 40 reads/s and 31 writes/s in ver. 6.1. The improvement is even more visible with SPB at 1B scale: from 10 reads/s and 2 writes/s in ver. 6.0 to 11 reads/s and 10 writes/s in ver. 6.1. In summary, GraphDB 6.1 handles more than twice as many updates at 50M scale and 5 times as many updates at a scale of 1 billion statements. GraphDB 6.1 is therefore already capable of dealing with a true Dynamic Semantic Publishing scenario, like the one of the [BBC|http://www.bbc.co.uk/blogs/legacy/bbcinternet/2012/04/sports_dynamic_semantic.html], at a scale of 1 billion statements and higher.
See more: http://www.ontotext.com/graphdb-benchmark-results/

- *Improved load capabilities for large new datasets in live database instances*. Scenario description: there is a production cluster with average load, running LDBC-50m (SPB) with 4 reading threads (doing select queries) and 1 writing thread (doing update queries). We need to add a large dataset (e.g. DBPedia) with hundreds of millions of statements as fast as possible, but without disrupting the overall cluster speed too much (not introducing write latency of more than 1-2s). The data set doesn't need inference, so it is loaded with the empty rule set.
Our implementation introduces a new "magic" statement (u, u, u), where u=<[http://www.ontotext.com/useParallelInsertion]>. If this statement is inserted at the beginning of the transaction, the data is loaded almost twice as fast (it reuses parts of the load chain from the LoadRDF tool) and the engine temporarily switches the ruleset to 'empty'. We found that splitting the data set into 50k chunks is a good compromise between high loading speed and a lower latency increase for parallel updates.
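
A minimal sketch of how a client might prepare such transactions is given below. Several points are assumptions rather than documented behaviour: that "50k chunks" means chunks of 50,000 statements, that the input is a sequence of N-Triples lines, and that the magic statement can be sent as the first triple of a SPARQL {{INSERT DATA}} update. The helper names {{chunked}} and {{build_update}} are hypothetical.

```python
from typing import Iterable, Iterator, List

# The "magic" statement (u, u, u) from the release note above.
PARALLEL_INSERTION = "<http://www.ontotext.com/useParallelInsertion>"
MAGIC_STATEMENT = f"{PARALLEL_INSERTION} {PARALLEL_INSERTION} {PARALLEL_INSERTION} ."


def chunked(statements: Iterable[str], size: int = 50_000) -> Iterator[List[str]]:
    """Yield consecutive lists of at most 'size' statements."""
    chunk: List[str] = []
    for statement in statements:
        chunk.append(statement)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def build_update(chunk: List[str]) -> str:
    """Build one update whose first triple is the magic statement (assumed usage)."""
    body = "\n".join([MAGIC_STATEMENT, *chunk])
    return f"INSERT DATA {{\n{body}\n}}"
```

Each string returned by {{build_update}} would then be sent as its own transaction, so that parallel updates from other clients can be interleaved between chunks.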

- Small *improvements in the bulk loading tools (LoadRDF)*. It is now possible to load different files into different contexts, as well as to provide Statements to it programmatically. See the LoadRDF page for details: [GraphDB-SE LoadRDF tool|GraphDB6:GraphDB-SE LoadRDF tool]. These improvements, combined with the increased update speed, allow us to load the English part of DBPedia 2014 (566M statements) in less than an hour, at a speed of 179 905 statements/sec.
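
The two DBPedia figures quoted above can be cross-checked with simple arithmetic. The sketch below treats "566M" as exactly 566,000,000 statements, which is a rounding assumption.

```python
statements = 566_000_000        # "566M statements", assumed to be a rounded figure
rate_per_second = 179_905       # quoted loading speed in statements/sec

load_seconds = statements / rate_per_second
load_minutes = load_seconds / 60

# Roughly 52 minutes, consistent with "less than an hour".
print(f"{load_minutes:.1f} minutes")
```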

- Improvements in *GraphDB Workbench*: The focus with this release was on *security* (users, roles) as well as small stability and usability improvements.

This is an integrated release that includes:
- GraphDB Engine 6.0.8274
- GraphDB Workbench 6.2.3
- GraphDB Connectors 3.1.0

h1. GraphDB version 6.0-RC6

This is an integrated release that includes:
- GraphDB Engine 6.0.8120
- GraphDB Workbench 6.2.2
- GraphDB Connectors 3.1.0

h2. GraphDB Engine 6.0.8120

The focus of this release was further improvement of the update speed on larger transactions (10K\+ statements) as well as overall stability.

Improvements:
- The transactions are serialized to JSON instead of XML. JSON Streaming parser is used - this minimizes the memory footprint of the master nodes
- improved handling of long transactions
- bigger transactions are compressed via GZip in the transaction log and when communicated between masters and workers
- GZipped updates can be sent via curl; add "Content-Encoding: gzip" header to use them
- Rule files can now be specified via HTTP and other protocols instead of local files
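
The note above mentions curl; the same kind of request can be sketched in Python to show what a gzipped update looks like on the wire. This is an illustration under assumptions: the endpoint URL is hypothetical, the {{application/sparql-update}} content type follows the SPARQL 1.1 Protocol rather than anything stated in these notes, and the function only builds the request without sending it.

```python
import gzip
import urllib.request


def build_gzipped_update(endpoint: str, sparql_update: str) -> urllib.request.Request:
    """Build a POST request whose body is a GZip-compressed SPARQL update."""
    body = gzip.compress(sparql_update.encode("utf-8"))
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/sparql-update",  # assumed media type
            "Content-Encoding": "gzip",                   # header named in the note
        },
    )


# Hypothetical endpoint; replace with the statements URL of a real repository.
request = build_gzipped_update(
    "http://localhost:8080/openrdf-sesame/repositories/demo/statements",
    "INSERT DATA { <http://example.org/s> <http://example.org/p> <http://example.org/o> }",
)
# urllib.request.urlopen(request) would send it; left out so the sketch has no side effects.
```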

Fixes:
- Fixed: OWLIM-1610 OOM on the Master with large update (600k statements)
- Fixed JMX update statistics
- Removed the obsolete Remote Master flag in the replication cluster
- Master backup folder moved under the Sesame data folder
- Fixed Lucene plugin to support Custom Analyzers/Scorers (this was broken on 6.0 releases due to plugin classloader not loading jars from the plugin directory)
- [OWLIM-1730|OWLIM-1730] Fixed handling of a failed initialisation within LiteralsPlugin
- [OWLIM-1712|OWLIM-1712] Query Optimizer does not apply 'strong' equality within FILTER when both variables are used as subject of some statement patterns
- [OWLIM-1615|OWLIM-1615] Fixed how running queries are handled on shutdown

h2. GraphDB Workbench

- the full list of changes in the latest version is available on this page: [GraphDB-Workbench Release Notes|GraphDB6:GraphDB-Workbench Release Notes]

h2. GraphDB Connectors (experimental)

- the full list of changes in the latest version is available on this page: [GraphDB6:GraphDB Connectors Release notes]


h1. GraphDB version 6.0-RC5

This is an integrated release that includes:
- GraphDB Engine 6.0.8070
- GraphDB Workbench 6.0.2
- GraphDB Connectors 3.0.0.RC2

h2. GraphDB Engine 6.0.8070


h3. Major changes:

- Improvements to the HA Cluster wrt New Cluster Tests: improved intra-cluster communications, worker initialization, status reporting, improved diagnostics and logging;
- Query monitoring via JMX - the full text of the query is now visible
- Fixes for the Constraint Violation support & multiple rulesets
- Faster update speeds
-- Now using GraphDB Custom NTriples/NQuads parser by default (so NTriples, NQuad formats are parsed faster than other formats)
-- when a transaction is using the *empty* ruleset, the commit can be added to all indexes in parallel. In order to use this experimental feature, add the special system statement: \_:b <[http://owlim.ontotext.com/owlim/useParallel]> \_:b at the beginning of the transaction. This makes sense for larger transactions (10K statements and above).

h3. Full list of changes:

- OWLIM-1628 Added a fix of the issue of not being able to explore a ruleset when the empty ruleset was set initially.
- OWLIM-1626 RepositoryException in Worker is not thrown by the Master
- OWLIM-1600 Query returns no results when using FILTER and BIND(if(...)) in it.
- T-10 Implemented automatic entity pool restore procedure which can recover a truncated entity pool and removes the statements from the repo using the IDs beyond the new entity pool size
- OWLIM-1603 Owlim crashes with lock error without obvious reasons (there is no other process that might have locked the repo).
- OWLIM-1592 Queries with at least one sub-select which intersect with an ordinary block of statement patterns perform poorly because of multiple clones and transforms of the Sesame's query model to Owlim's one.
- OWLIM-1593 Fixed bug in MainQuery.clone() (when using Subselect and there are OPTIONALs)
- OWLIM-1572 Query Monitoring - show query text instead of query id
- OWLIM-1559 Fixed property path bug when same property paths are repeated in the query
- F-320 JMX: NumberOfExplicitTriples and NumberOfTriples shows \-1 even though data has been written to the triple store
- OWLIM-1563 Fixed the issue with custom ruleset + disable-sameAs=true.
- OWLIM-1559 Implemented a shortcut in the MINUS operator which allows for faster calculation when the MINUS is over two subqueries with one triple pattern (which may have filters).

h2. GraphDB Workbench

- the full list of changes in the latest version is available on this page: [GraphDB-Workbench Release Notes|GraphDB6:GraphDB-Workbench Release Notes]

h2. GraphDB Connectors (experimental)

- the full list of changes in the latest version is available on this page: [GraphDB6:GraphDB Connectors Release notes]

h1. GraphDB version 6.0-RC4

This is an integrated release that includes:
- GraphDB Engine 6.0.7914
- GraphDB Workbench 6.0.1
- GraphDB Connectors 3.0.0.RC2


h1. GraphDB version 6.0-RC3 (build 7914)

h2. Fixes:

* Improved communication between master and worker nodes with respect to the acceptance tests;
* Worker thread: fixed out-of-sync handling upon initialisation;
* Improved logging; in particular, fixed the skipping of some stacktraces by the JVM;
* Initialisation of a 5.6 worker from a 5.4 image now skips "entityIdSize" and InferencerCRC from owlim.properties.


Ontotext redesigned its cluster architecture to support the case of two or more separate data centres (each with its own Master and Worker nodes), and to provide asynchronous transactions and Master fail-over. OWLIM Enterprise already supported Master-Worker clusters with Automatic Replication, Load-balancing and Transaction Logs, but in this release these components are improved. OWLIM 5.6 is based on 5.5 and inherits its write performance improvements.
* [OWLIM56:Client Fail-over Utility|OWLIM56:Client Fail-over Utility], which can be configured to fall back to the next master if the first master becomes unavailable;
* Better TransactionLog support (see: [OWLIM56:Transaction Log Improvements]) - the updates are synchronised between all masters in all data centres;
* All Masters are now Read/Write;