Problem

When we implemented Bookmarks, Delete proved to be a very slow operation: the first delete transaction takes 10s, subsequent ones take 5s.

  • Since Update in SPARQL is implemented as Insert then Delete, this affects Update as well.
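
A minimal sketch of such an update, assuming a hypothetical bookmark vocabulary (the :label property and the bookmark URI are placeholders, not our actual schema):

PREFIX : <http://example.org/bookmarks#>

# Rename one bookmark; internally this costs a Delete (the slow part)
# plus an Insert.
DELETE { <urn:bookmark:42> :label ?old }
INSERT { <urn:bookmark:42> :label "New name" }
WHERE  { <urn:bookmark:42> :label ?old }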

Note: Delete All is implemented as a single transaction, and its speed is almost independent of how many bookmarks are deleted.
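
For comparison, a Delete All over bookmarks can be phrased as one update, hence one transaction (the vocabulary is again a placeholder):

PREFIX : <http://example.org/bookmarks#>

# removes every bookmark and all its properties in one transaction
DELETE WHERE { ?bm a :Bookmark ; ?p ?o }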

Resolution

Profiling the slow operation in Java pointed to the ruleset in use as the problem. OWLIM's Incremental Retract does backward chaining to find all consequences of a deleted statement, deletes those that have no alternative support, and then recurses.

  • A lot of testing and trials showed that delete is fast with RDFS, OWL-Horst, and the specific FR-Implementation ruleset that we use.
  • The error turned out to be that the repo is opened with OWL2-RL instead of the FR-Implementation ruleset used to create it (see the configuration sketch after this list).
  • OWL2-RL involves very complex reasoning, which makes the incremental retract slow.
  • Note: FRs need to be inferred only while loading the initial data, not while adding annotations/bookmarks.
    So we could open the repository with RDFS+inverse (not the full FR-Implementation ruleset) to speed things up a bit.
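
For reference, a minimal sketch of an OWLIM-SE repository configuration that fixes the ruleset at open time. The repository ID and the .pie path are hypothetical; the point is that owlim:ruleset must name the same ruleset the repo was created with:

@prefix rep:   <http://www.openrdf.org/config/repository#> .
@prefix sr:    <http://www.openrdf.org/config/repository/sail#> .
@prefix sail:  <http://www.openrdf.org/config/sail#> .
@prefix owlim: <http://www.ontotext.com/trree/owlim#> .

[] a rep:Repository ;
   rep:repositoryID "bookmarks" ;                 # hypothetical ID
   rep:repositoryImpl [
      rep:repositoryType "openrdf:SailRepository" ;
      sr:sailImpl [
         sail:sailType "owlim:Sail" ;
         # custom rulesets are given as a path to a .pie file
         owlim:ruleset "/path/to/FR-Implementation.pie"
      ]
   ] .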

Marking Data Read-Only

The OWLIM documentation on Incremental Retract says:

The forward chaining part of the Incremental Retract algorithm terminates as soon as it detects that a statement is read only, because if it cannot be deleted there is no need to look for statements derived from it. For this reason, performance can be greatly improved.

The Repository Creation app loads a bunch of files into the repo: ontologies, thesauri and museum data (BM and RKD). This data never changes (user-created data like annotations, bookmarks, searches, etc. is different). So if we mark all of it as read-only statements, that should improve delete/update performance. To do this, include an onto:schemaTransaction statement in the transaction that loads the data, e.g.:

[] <http://www.ontotext.com/owlim/system#schemaTransaction> [].

This might have some negative effect on Repository Creation performance, as the documentation says "Such transactions are likely to be much more computationally expensive to achieve". But we hope not, since there is no Delete during the initial repo load, only Insert.
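
If the load goes through SPARQL Update, the marker can simply ride along in the same request. A sketch with a placeholder data triple (the actual load may equally go through the repository API, as long as the marker statement is part of the same transaction):

INSERT DATA {
  # ... the ontology / thesaurus / museum data being loaded ...
  <http://example.org/object/1> a <http://example.org/Object> .

  # mark everything in this transaction as read-only schema statements
  [] <http://www.ontotext.com/owlim/system#schemaTransaction> [] .
}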

Other Approaches

We discussed some other approaches:

  • put each Bookmark in a separate graph (context) and delete by graph (see the sketch after this list)
  • simplify the FR-Implementation ruleset by moving more of it into statements (e.g. simplified PropertyChainAxioms of 2 & 3 properties). Not sure whether this would speed up delete, given that the same number of statements will be inferred.
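
A minimal sketch of the per-graph approach, with placeholder URIs:

# create the bookmark inside its own named graph...
INSERT DATA {
  GRAPH <urn:bookmark:42> {
    <urn:bookmark:42> a <http://example.org/Bookmark> .
  }
}

# ...and later delete it wholesale by dropping that graph
DROP GRAPH <urn:bookmark:42>

Note that dropping the graph removes the explicit statements only; retracting their inferred consequences is still up to the ruleset, so this may not avoid the cost measured above.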