GraphDB-Enterprise Release Notes


Starting from version 6, GraphDB includes three separate products with their own version numbers: GraphDB Engine, GraphDB Workbench and GraphDB Connectors (experimental). New features and significant bug fixes and updates for the last few releases are recorded here. Each product's full version number is given as:

major.minor.build_number

e.g. 5.3.5928 where the major number is 5, the minor number is 3 and the build number is 5928.
The integrated releases have their own version, e.g. 6.0-RC1.

Releases with the same major and minor version numbers do not contain any new features. The only difference is that releases with later build numbers contain fixes for bugs discovered since the previous release. New or significantly changed features are released with a higher major or minor version number.
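
The numbering rule above can be illustrated with a short sketch. The names `ProductVersion`, `parse_version` and `is_bugfix_only_upgrade` are hypothetical helpers introduced for illustration only and are not part of any GraphDB tool. The sketch covers product versions of the form major.minor.build_number, not integrated release labels such as 6.0-RC1.

```python
from typing import NamedTuple


class ProductVersion(NamedTuple):
    """A product version of the form major.minor.build_number."""
    major: int
    minor: int
    build: int


def parse_version(text: str) -> ProductVersion:
    """Parse a string such as '5.3.5928' into its three numeric parts."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a major.minor.build_number version: {text!r}")
    major, minor, build = (int(p) for p in parts)
    return ProductVersion(major, minor, build)


def is_bugfix_only_upgrade(old: str, new: str) -> bool:
    """True when both versions share major and minor and 'new' has a later build."""
    a, b = parse_version(old), parse_version(new)
    return (a.major, a.minor) == (b.major, b.minor) and b.build > a.build
```

For example, `is_bugfix_only_upgrade("6.0.8120", "6.0.8264")` compares two Engine builds with the same major and minor numbers, so by the rule above the later build should differ only in bug fixes.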

GraphDB version 6.1

Highlights:

  • Various stability fixes to the cluster, including proper error handling in the master shutdown sequence, error-resilient synchronization threads, and safe saving of the cluster configuration properties. The repository fingerprint now also reflects the number of statements, and there is better handling of stress events that happen during transactions and dirty shutdowns (e.g. out of disk space).
  • Much faster write transactions for small insert/update/delete operations on large repositories. Results on the LDBC Semantic Publishing Benchmark (SPB) at 50M scale went up from 32 read and 12 write queries per second in version 6.0 to 37 reads/sec and 28 writes/sec in version 6.1. The improvement is even more visible with SPB at 1B scale: from 10 reads/sec and 2 writes/sec in version 6.0 to 11 reads/sec and 10 writes/sec in version 6.1. In summary, GraphDB 6.1 handles more than twice as many updates at 50M scale and five times as many updates at a scale of 1 billion statements. GraphDB 6.1 is therefore capable of dealing with a true Dynamic Semantic Publishing scenario, like that of the BBC, at a scale of 1 billion statements and higher.
  • Improved load capabilities for large new datasets in live database instances. Scenario description: there is a production cluster under average load, running LDBC-50m (SPB) with 4 reading threads (doing select queries) and 1 writing thread (doing update queries). We need to add a large dataset (e.g. DBPedia) with hundreds of millions of statements as fast as possible, but without disrupting the overall cluster speed too much (not introducing write latency of more than 1-2s). The dataset doesn't need inference, so it is loaded with the empty ruleset.
    Our implementation introduces a new "magic" statement (u, u, u), where u=<http://www.ontotext.com/useParallelInsertion>. If this statement is inserted at the beginning of the transaction, the data is loaded almost twice as fast (it reuses parts of the load chain from the LoadRDF tool) and the engine temporarily switches the ruleset to 'empty'. We found that splitting the dataset into 50k chunks is a good compromise between high loading speed and lower latency degradation of parallel loads.
  • Small improvements in the bulk loading tool (LoadRDF). It is now possible to load different files into different contexts, as well as to provide Statements to it programmatically. See the GraphDB-SE LoadRDF tool page for details. These improvements, combined with the increased update speed, allow us to load the English part of DBPedia 2014 (566M statements) in less than an hour, at a speed of 179 905 statements/sec.
  • Improvements in GraphDB Workbench: The focus with this release was on security (users, roles) as well as small stability and usability improvements.
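
The chunked loading with the "magic" parallel-insertion statement described in the highlights above might be prepared on the client side roughly as follows. This is a minimal sketch under stated assumptions: it assumes the data is available as N-Triples lines, that "50k chunks" means 50,000 statements per transaction, and that the magic statement can be written as an ordinary N-Triples line. The function name `chunked_transactions` is hypothetical, and sending each batch to the repository is left out.

```python
from typing import Iterable, Iterator, List

# Marker IRI quoted in the release note; it is used as subject, predicate and object.
PARALLEL_IRI = "<http://www.ontotext.com/useParallelInsertion>"
MAGIC_STATEMENT = f"{PARALLEL_IRI} {PARALLEL_IRI} {PARALLEL_IRI} ."


def chunked_transactions(
    ntriples_lines: Iterable[str], chunk_size: int = 50_000
) -> Iterator[List[str]]:
    """Yield batches of N-Triples lines, each starting with the magic statement.

    Every batch is meant to be sent as one transaction (assumption); the magic
    statement comes first and does not count towards chunk_size.
    """
    chunk = [MAGIC_STATEMENT]
    for line in ntriples_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chunk.append(line)
        if len(chunk) - 1 == chunk_size:
            yield chunk
            chunk = [MAGIC_STATEMENT]
    if len(chunk) > 1:  # emit the final, partially filled batch
        yield chunk
```

Each yielded batch could then be joined with newlines and submitted as a single transaction through whatever client the deployment already uses.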

This is an integrated release that includes:

  • GraphDB Engine 6.0.8264
  • GraphDB Workbench 6.2.3
  • GraphDB Connectors 3.1.0

GraphDB version 6.0-RC6

This is an integrated release that includes:

  • GraphDB Engine 6.0.8120
  • GraphDB Workbench 6.2.2
  • GraphDB Connectors 3.1.0

GraphDB Engine 6.0.8120

The focus of this release was further improvement of update speed on larger transactions (10K+ statements), as well as overall stability.

Improvements:

  • Transactions are serialized to JSON instead of XML. A streaming JSON parser is used, which minimizes the memory footprint of the master nodes
  • Improved handling of long transactions
  • Bigger transactions are compressed with GZip in the transaction log and when communicated between masters and workers
  • GZipped updates can be sent via curl; add the "Content-Encoding: gzip" header to use them
  • Rule files can now be specified via HTTP and other protocols instead of local files
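
The gzipped update mentioned in the list above could also be prepared from Python instead of curl. The sketch below only builds the request with the standard library and does not send it. The endpoint URL and the content type are assumptions made for illustration and depend on the actual deployment; the function name `build_gzipped_update` is hypothetical.

```python
import gzip
import urllib.request


def build_gzipped_update(
    endpoint_url: str, body: bytes, content_type: str
) -> urllib.request.Request:
    """Build a POST request whose body is gzip-compressed.

    The 'Content-Encoding: gzip' header is the one named in the release note;
    endpoint_url and content_type are supplied by the caller (assumptions).
    """
    compressed = gzip.compress(body)
    return urllib.request.Request(
        endpoint_url,
        data=compressed,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Content-Encoding": "gzip",
        },
    )


if __name__ == "__main__":
    # Hypothetical repository URL and payload, for illustration only.
    request = build_gzipped_update(
        "http://localhost:8080/openrdf-sesame/repositories/example/statements",
        b"<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n",
        "text/plain",
    )
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would actually send it.
```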

Fixes:

  • [OWLIM-1610] Fixed OOM on the Master with large updates (600k statements)
  • Fixed JMX update statistics
  • Removed the obsolete Remote Master flag in the replication cluster
  • Master backup folder moved under the Sesame data folder
  • Fixed Lucene plugin to support Custom Analyzers/Scorers (this was broken on 6.0 releases due to plugin classloader not loading jars from the plugin directory)
  • [OWLIM-1730] Fixed handling of a failed initialisation within LiteralsPlugin
  • [OWLIM-1712] Query Optimizer does not apply 'strong' equality within FILTER when both variables are used as subject of some statement patterns
  • [OWLIM-1615] Fixed how running queries are handled on shutdown

GraphDB Workbench

GraphDB Connectors (experimental)

GraphDB version 6.0-RC5

This is an integrated release that includes:

  • GraphDB Engine 6.0.8070
  • GraphDB Workbench 6.0.2
  • GraphDB Connectors 3.0.0.RC2

GraphDB Engine 6.0.8070

Major changes:

  • Improvements to the HA Cluster with respect to the New Cluster Tests: improved intra-cluster communications, worker initialization, status reporting, diagnostics and logging;
  • Query monitoring via JMX - the full text of the query is now visible
  • Fixes for the Constraint Violation support & multiple rulesets
  • Faster update speeds
    • Now using the GraphDB custom NTriples/NQuads parser by default (so the NTriples and NQuads formats are parsed faster than other formats)
    • When a transaction is using the empty ruleset, the commit can be added to all indexes in parallel. To use this experimental feature, add the special system statement _:b <http://owlim.ontotext.com/owlim/useParallel> _:b at the beginning of the transaction. This makes sense for larger transactions (10K statements and above).
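
As a sketch of how a client might apply the experimental parallel-commit hint from the list above, the helper below prepends the system statement only when a transaction reaches the 10K-statement size that the note recommends. It assumes the transaction is held as a list of N-Triples lines; the name `with_parallel_hint` is hypothetical. The helper does not check that the repository uses the empty ruleset, which the note states as a precondition.

```python
from typing import List

# System statement quoted in the release note, written here as one N-Triples line.
USE_PARALLEL_HINT = "_:b <http://owlim.ontotext.com/owlim/useParallel> _:b ."


def with_parallel_hint(statements: List[str], min_size: int = 10_000) -> List[str]:
    """Return the statements, with the hint first when the transaction is large.

    min_size mirrors the '10K statements and above' guidance; smaller
    transactions are returned unchanged (as a copy).
    """
    if len(statements) >= min_size:
        return [USE_PARALLEL_HINT] + list(statements)
    return list(statements)
```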

Full list of changes:

  • OWLIM-1628 Fixed the issue of not being able to explore a ruleset when the empty ruleset was set initially.
  • OWLIM-1626 RepositoryException in Worker is not thrown by the Master
  • OWLIM-1600 Query returns no results when using FILTER and BIND(if(...)) in it.
  • T-10 Implemented an automatic entity pool restore procedure, which can recover a truncated entity pool and remove from the repository the statements that use IDs beyond the new entity pool size
  • OWLIM-1603 Owlim crashes with lock error without obvious reasons (there is no other process that might have locked the repo).
  • OWLIM-1592 Queries with at least one sub-select which intersect with an ordinary block of statement patterns perform poorly because of multiple clones and transforms of the Sesame's query model to Owlim's one.
  • OWLIM-1593 Fixed bug in MainQuery.clone() (when using Subselect and there are OPTIONALs)
  • OWLIM-1572 Query Monitoring - show query text instead of query id
  • OWLIM-1559 Fixed property path bug when same property paths are repeated in the query
  • F-320 JMX: NumberOfExplicitTriples and NumberOfTriples show -1 even though data has been written to the triple store
  • OWLIM-1563 Fixed the issue with custom ruleset + disable-sameAs=true.
  • OWLIM-1559 Implemented a shortcut in the MINUS operator which allows for faster calculation when the MINUS is over two subqueries with one triple pattern (which may have filters).

GraphDB Workbench

GraphDB Connectors (experimental)

GraphDB version 6.0-RC4

This is an integrated release that includes:

  • GraphDB Engine 6.0.7914
  • GraphDB Workbench 6.0.1
  • GraphDB Connectors 3.0.0.RC2

GraphDB version 6.0-RC3 (build 7914)

Fixes:

  • The plugins were moved to <webapps>/openrdf-sesame/WEB-INF/classes/plugins;
  • Running GraphDB under embedded Tomcat failed with NPE (because of non existing webapps/ folder).

GraphDB version 6.0-RC2 (build 7892)

Improvements:

  • Added mini LDBC Semantic Publishing Benchmark (http://ldbc.eu) into benchmark/ldbc-spb folder in the distribution;
  • The plugins are now in the <webapps>/openrdf-sesame/plugins folder. The Lucene plugin is enabled by default. This can be overridden with the -Dregister-external-plugins option;
  • Minor rearrangement of the files in the main distribution folder (all .pie files are put into rules/ subfolder, the scripts into scripts/ subfolder).

Fixes:

  • Fixed issue with the default/evaluation license;
  • Fixed issue with the LoadRDF tool.

GraphDB version 6.0 (build 7784)

GraphDB 6.0 is a re-branded Owlim 5.6 version. The differences below are given relative to the last stable release, Owlim 5.4.

Improvements:

  • High Availability Cluster;
  • Fast writes in SAFE Mode (OWLIM 5.5 improvement, which led to incompatible binary formats between 5.4 and 5.5+);
  • LoadRDF tool for faster bulk loading of data; speeds ~100KSt/s and above, without inference;
  • Explain Plan like functionality;
  • LVM-based Backup and Replication.

Fixes:

  • Databases created with one setting of the "entity-id-size" parameter (32 vs 40-bit) and opened with another setting, would crash in versions prior to 6.0. Now an exception is thrown and the repository is not initialized.

Version 5.6 (build 7713)

Improvements:

  • [LVM-based Backup and Replication] - Backup can optionally be based on the LVM Shadow Volume Copy - which makes it faster and the worker is released a few seconds after the backup is started (ported from 5.4).
  • [New Cluster Test (cluster deployment and test tool)] - a tool for automated deployment and testing of clusters of various sizes. Can deploy on AWS and local instances. Supports docker format. Allows for acceptance, stress and load tests to be run on the deployed clusters. Optionally, creates Nagios configuration for the deployed cluster.
  • LoadRDF tool - a tool for a faster bulk loading of data, which has been merged from 5.5 branch.
  • Merged EntityPool Reverse Cache from 5.5 - speeds up large updates (100+ statements).

Fixes:

  • All AcceptanceTests that were previously failing are now fixed:
    • Improved communication between master and worker nodes with respect to the acceptance tests;
    • Worker thread: fixed out-of-sync handling upon initialisation;
  • Improved logging; in particular, fixed the JVM skipping some stack traces;
  • Initialisation of a 5.6 worker from a 5.4 image now skips "entityIdSize" and InferencerCRC from owlim.properties.

Version 5.6 beta 3 (build 7659)

Fixes:

  • cluster - empty worker initialisation;
  • worker - initial update handling;
  • log sync: 10s wait between idle rounds (network bandwidth optimisation);
  • Tx log: initialisation bug fixed;
  • update might fail when replication is in progress;
  • miscellaneous bug fixes in the cluster utils (deployment/status) and proxies restart;
  • detailed logging:
    • replication cluster worker events;
    • HTTP client stats;
    • Tx log initialisation.

Known issues:

  • AcceptanceTests failing: W4, M4, MW3, MW7, MW8;
  • the new/experimental LVM backup/restore feature is not yet ported from 5.4 (and thus MW10 and MW11 Acceptance Tests are not implemented, because they are based on it).

Version 5.6 beta 2 (build 7523)

Fixes:

  • updated AcceptanceTests in the MastersAndWorkers section;
  • replication start/wait methods improved;
  • several fixes to the TxLog protocol;
  • fixed replication logic to delete the Worker repo only when the remote worker confirms the replication;
  • additional sanity checks added to the Master-to-Master and Master-to-Worker synchronisation;
  • improved logging, incl. "SPLITBRAIN" events logged both to logs and to JMX.

Known issues:

  • some MW* tests with the forced replication fail randomly, but rarely - related to the Proxy tool.

Version 5.6 beta 1 (build 7368)

Ontotext redesigned its cluster architecture to support the case of two or more separate data centres (each with its own Master and Worker nodes), and to provide asynchronous transactions and Master fail-over. OWLIM Enterprise already supported Master-Worker clusters with Automatic Replication, Load-balancing and Transaction Logs, but in this release these components are improved. OWLIM 5.6 is based on 5.5 and inherits its write performance improvements.

  • [OWLIM56:Client Fail-over Utility], which can be configured to fall back to the next master if the first master becomes unavailable;
  • Better TransactionLog support (see: [OWLIM56:Transaction Log Improvements]) - the updates are synchronised between all masters in all data centers;
  • All Masters are now Read/Write;
  • [OWLIM56:Smart Replication];
  • Protocol backwards compatibility - the ability to upgrade the OWLIM cluster without downtime, following the OWLIM Upgrade Procedure;
  • [External Plug-ins] - the plugins in OWLIM are moved into a separate plugin directory, and now can be upgraded/maintained separately.
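
The fall-back behaviour of a client fail-over utility, as described in the list above, can be sketched generically. This is not the OWLIM utility itself or its API. It is a minimal illustration that assumes an ordered list of master URLs and a caller-supplied function that performs one request against a given master. The names `call_with_failover` and `AllMastersUnavailable` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllMastersUnavailable(Exception):
    """Raised when every configured master failed; carries (url, error) pairs."""


def call_with_failover(masters: Sequence[str], request: Callable[[str], T]) -> T:
    """Try each master in order and return the first successful result.

    A master counts as unavailable when the request raises ConnectionError
    or TimeoutError (assumption); any other exception propagates to the caller.
    """
    errors: List[Tuple[str, Exception]] = []
    for url in masters:
        try:
            return request(url)
        except (ConnectionError, TimeoutError) as exc:
            errors.append((url, exc))  # remember the failure, try the next master
    raise AllMastersUnavailable(errors)
```

Trying the masters in a fixed order keeps the first master preferred whenever it is reachable, which matches the "fall back to the next master" wording; a real client would likely also add timeouts and retry limits.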

Known issues:

  • Transaction consistency concern. In the new cluster, the Master responds to an update from a client as soon as the test node completes it.

In a single-threaded scenario, the next query can be evaluated on a node that has either not yet received the update or not yet completed it, which can lead to inconsistency from the client application's point of view. This deviates from update processing in 5.4, where the response is created after the last of the available nodes completes the update [OWLIM-1483].
