Test Suite Improvements

Skip to end of metadata
Go to start of metadata
Search
This documentation is NOT for the latest version of GraphDB.

Latest version - GraphDB 7.1

GraphDB Documentation

Next versions

GraphDB 6.2
GraphDB 6.3
GraphDB 6.4
GraphDB 6.5
GraphDB 6.6
GraphDB 7.0
GraphDB 7.1

Previous versions

OWLIM 5.4
OWLIM 5.3
OWLIM 5.2
OWLIM 5.1
OWLIM 5.0
OWLIM 4.4
OWLIM 4.3
OWLIM 4.2
OWLIM 4.1
OWLIM 4.0

Test life cycle

  • setUp() - will show "setUp()" in the logs;
    • Cluster verification (marked with prefix "CV"). If this fails, the test will be skipped.
    • Start of continuous queries (marked with prefix "CQ").
  • Actual test
    • starts with ">>>>> testXXX";
    • ends with ">>>>> done: testXXX";
    • if the "done" part is missing, then the test has either been skipped or has failed. This is done in order to detect skipped tests via failed JUnit/assumptions because the assumptions are not logged anywhere.
  • tearDown() - will show "tearDown()" in the logs.
  • stop continuous queries (CQ). This may take some time, especially for write queries, if the whole cluster is not yet syncrhonized/replicated.
    • show CQ statistics: R/W queries executed;
    • show CQ failures: R/W queries failed (if there aren't any, this is omitted).

Logging of the life cycle

  • This is logged to the console via Log4J.
  • Example output - testM2 - (comments on the left before "->"):
        [junit] 2014-08-14 18:23:49,178 [main] INFO  test.logger - ===========================================
    ** setUp->    
        [junit] setUp(): Thu Aug 14 18:23:49 EEST 2014
        [junit] 2014-08-14 18:23:49,507 [main] INFO  test.logger - nodes started
    ** CV is OK ->
        [junit] 2014-08-14 18:23:49,547 [main] INFO  test.logger - client API reinitialized
        [junit] 2014-08-14 18:23:49,738 [main] INFO  test.logger - CV: OK. retries = 0, stale retries = 0
        [junit] 2014-08-14 18:23:49,767 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Starting background queries. Threads: 4R/1W
    ** Test Name ->
        [junit] 2014-08-14 18:23:51,768 [main] INFO  test.logger - >>>>>>>>>> testM2
        [junit] 2014-08-14 18:23:51,811 [main] INFO  test.logger - Stress node: master3, action: 4
    ** Stress event ->
        [junit] 2014-08-14 18:23:51,811 [main] INFO  test.logger - - stress: Disk space limit on node: master3
    ** Test finished OK ->
        [junit] 2014-08-14 18:26:01,391 [main] INFO  test.logger - >>>>>>>>>> done: testM2
        [junit] 2014-08-14 18:26:01,391 [main] INFO  test.logger - --------------------
    ** tearDown()->
        [junit] tearDown() - will verify cluster statements & fingerprint - Thu Aug 14 18:26:01 EEST 2014
        [junit] 2014-08-14 18:26:01,391 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Stopping background queries...
        [junit] 2014-08-14 18:26:01,391 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: [WriterThread-024] stopping thread
        [junit] 2014-08-14 18:26:01,391 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: [ReaderThread-025] stopping thread
        [junit] 2014-08-14 18:26:01,392 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: [ReaderThread-026] stopping thread
        [junit] 2014-08-14 18:26:01,392 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: [ReaderThread-027] stopping thread
        [junit] 2014-08-14 18:26:01,392 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: [ReaderThread-028] stopping thread
        [junit] 2014-08-14 18:26:01,394 [cq-stop-029] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Waiting for WriterThread-024...
        [junit] 2014-08-14 18:26:01,395 [cq-stop-029] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Waiting for ReaderThread-025...
        [junit] 2014-08-14 18:26:01,395 [cq-stop-029] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Waiting for ReaderThread-026...
        [junit] 2014-08-14 18:26:01,395 [cq-stop-029] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Waiting for ReaderThread-027...
        [junit] 2014-08-14 18:26:01,395 [cq-stop-029] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Waiting for ReaderThread-028...
        [junit] 2014-08-14 18:26:01,395 [cq-stop-029] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Stopped background queries in 3ms
        [junit] 2014-08-14 18:26:01,395 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Statistics:
    ** CQ Stats->
        [junit] 2014-08-14 18:26:01,395 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Executed queries: 2470R, 7W in 131627 ms.
    ** CQ Failures->
        [junit] 2014-08-14 18:26:01,396 [main] WARN  com.ontotext.test.utils.ContinuousQueries - CQ: Failed continues queries: 3R/0W.
    ** CV Retries->
        [junit] 2014-08-14 18:26:01,770 [main] DEBUG test.logger - CV: 600 retries left, 10 stale retries left.
        [junit]
        [junit] 2014-08-14 18:26:12,959 [main] DEBUG test.logger - CV: 590 retries left, 8 stale retries left.
        [junit]
    ** CV finally OK ->
        [junit] 2014-08-14 18:26:17,476 [main] INFO  test.logger - CV: OK. retries = 13, stale retries = 5
    

Continuous query execution

The test suite now continuously executes queries and updates in a number of threads while running the tests. These queries and updates are more complex when taken from the LDBC-50M suite. We currently start 4 Read threads and 1 Write thread. All of the read queries from LDBC SPB are used. Only the insert queries are used for Writing. This is related to:
a) the fact that we use a cache of CWorks, which will be invalidated if we delete a CWork that belongs to it;
b) the cluster validation expects workers that are lagging to only increase their statements - this was mostly to prevent waiting long times before deciding that the cluster is broken.

The fact that we do not use DELETE/Update queries is not a limitation of this tests. They are used as part of the load test.

The following statistics will be printed at the end of each test:

2014-08-14 16:16:50,122 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Statistics:
2014-08-14 16:16:50,122 [main] INFO  com.ontotext.test.utils.ContinuousQueries - CQ: Executed queries: 4R, 2W in 3039 ms.

JMX Notification logging

The test tool gathers the JMX notifications from all masters and logs them in the notifications.log file. This is done so that later, special events can be checked such as replication or split brain.

Log4j

The test tool now uses Log4j to log the output. This makes it easier to control what to log and where to log it.

Cluster Verification: Improved cluster stabilisation detection

Verify that the cluster is in good shape at least twice per test: before you start it and after the test is finished. The old verification process was the following:

  1. Check if all nodes are on-line and if they have the same number of statements and fingerprints.
  2. If the check passes, the cluster is OK. Stop.
  3. If this process is repeated too many times, then the cluster is in a bad state. Stop.
  4. Otherwise, sleep for some time and go back to step #1.

The new verification process is similar but with two important differences:

  1. You still check the states but the waiting period is only 1 second. This enables you to pick up a "good" cluster quite fast.
  2. You count stale states (these are states that are the same as the previous state). If there are too many consecutive stale states (at the moment this is set to 10), you can stop the process early. This way you do not have to wait too long, if the cluster is "frozen".
Labels:
None
Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.