Experimental Explain Plan

This documentation is NOT for the latest version of GraphDB.

Latest version - GraphDB 7.1

The Experimental Explain Plan is a new, improved feature introduced in GraphDB version 6.4.3. Both explain plans are available in versions 6.4.3, 6.4.4 and 6.5.

Activating the Experimental Explain Plan

To see the query explain plan, use the onto:experimental-explain pseudo-graph:

PREFIX onto: <http://www.ontotext.com/>
select * from onto:experimental-explain
...
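
One way to submit such a query from a script is over the standard SPARQL 1.1 Protocol. The following is a minimal sketch that only builds the HTTP request with Python's standard library and does not send it. The endpoint URL and the helper name build_explain_request are hypothetical, and how the plan text is encoded in the response is not covered here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: substitute the SPARQL endpoint URL of your own repository.
ENDPOINT = "http://localhost:8080/repositories/example"

# The simplest explain query: the pseudo-graph goes into the FROM clause.
EXPLAIN_QUERY = """\
PREFIX onto: <http://www.ontotext.com/>
select * from onto:experimental-explain {
   ?s ?p ?o .
}
"""


def build_explain_request(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol POST request for the query."""
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return Request(endpoint, data=body, headers=headers, method="POST")


request = build_explain_request(ENDPOINT, EXPLAIN_QUERY)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. That step is left out because it needs a running repository.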

Simple explain plan

For the simplest possible query explain plan (?s ?p ?o), execute the following query:

PREFIX onto: <http://www.ontotext.com/>
select * from onto:experimental-explain {       
   ?s ?p ?o .    
}

Depending on the number of triples that you have in the database, the results will vary, but you will get something like the following:

SELECT ?s ?p ?o
{

  { # ----- Begin optimization group 1 -----        
    ?s ?p ?o . # Collection size: 108.0                   
    # Predicate collection size: 108.0                   
    # Unique subjects: 90.0                   
    # Unique objects: 55.0                   
    # Current complexity: 108.0      

   } # ----- End optimization group 1 -----
      # ESTIMATED NUMBER OF ITERATIONS: 108.0

 }

This is the same query, annotated with estimates next to each statement pattern (only one in this case).

The returned query might not be identical to the original one, because the triple patterns appear in the order in which they are executed internally. The annotations mean the following:
  • ----- Begin optimization group 1 ----- - marks the start of a group of statements, which is most probably part of a subquery (for property paths, the group is the whole path);
  • Collection size - an estimate of the number of statements that match the pattern;
  • Predicate collection size - the number of statements in the database for this particular predicate (in this case, for all predicates);
  • Unique subjects - the number of subjects that match the statement pattern;
  • Unique objects - the number of objects that match the statement pattern;
  • Current complexity - the number of atomic index lookups the database needs to make so far in the optimization group (most of the time, a subquery). With multiple triple patterns, these numbers grow fast;
  • ----- End optimization group 1 ----- - marks the end of the optimization group;
  • ESTIMATED NUMBER OF ITERATIONS: 108.0 - the approximate number of iterations that will be executed for this group.
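
Because the estimates are emitted as comments, they can be pulled out of the plan text with a small script. The following is a minimal sketch that assumes the plan is available as a plain string shaped like the sample output above. The names ANNOTATION and parse_annotations are illustrative and not part of GraphDB.

```python
import re

# Matches annotation comments such as "# Collection size: 108.0".
ANNOTATION = re.compile(r"#\s*([A-Za-z ]+?):\s*([0-9]+(?:\.[0-9]+)?)\s*$")


def parse_annotations(plan_text):
    """Return (label, value) pairs for every estimate comment in the plan."""
    pairs = []
    for line in plan_text.splitlines():
        match = ANNOTATION.search(line)
        if match:
            pairs.append((match.group(1).strip(), float(match.group(2))))
    return pairs


# Plan text copied from the sample output above.
SAMPLE_PLAN = """\
  { # ----- Begin optimization group 1 -----
    ?s ?p ?o . # Collection size: 108.0
    # Predicate collection size: 108.0
    # Unique subjects: 90.0
    # Unique objects: 55.0
    # Current complexity: 108.0
   } # ----- End optimization group 1 -----
      # ESTIMATED NUMBER OF ITERATIONS: 108.0
"""

for label, value in parse_annotations(SAMPLE_PLAN):
    print(f"{label}: {value}")
```

The group markers are skipped because they contain no "label: number" pair, so only the six estimates are reported.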

Multiple triple patterns

The explain plan lists the triple patterns in the exact order in which the engine is going to execute them.

The following is an example where the engine reorders the triple patterns based on their complexity. The query is a simple join:

PREFIX onto: <http://www.ontotext.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select *
from onto:experimental-explain
{        
     ?o rdf:type ?o1 .        
     ?o rdfs:subPropertyOf ?o2    
}

Here is the output:

SELECT ?o ?o1 ?o2
{

  { # ----- Begin optimization group 1 -----        
    ?o rdfs:subPropertyOf ?o2 . # Collection size: 20.0                                    
                                # Predicate collection size: 20.0                                    
                                # Unique subjects: 19.0                                    
                                # Unique objects: 18.0                                    
                                # Current complexity: 20.0        
    ?o rdf:type ?o1 . # Collection size: 43.0                          
                      # Predicate collection size: 43.0                          
                      # Unique subjects: 34.0                          
                      # Unique objects: 7.0                          
                      # Current complexity: 860.0      
   } # ----- End optimization group 1 -----
   # ESTIMATED NUMBER OF ITERATIONS: 25.294117647058822

}

Understanding the output:

  • ?o rdfs:subPropertyOf ?o2 has the smaller collection size (20 instead of 43), so it is executed first.
  • ?o rdf:type ?o1 has the bigger collection size (43 instead of 20), so it is executed second, although it is written first in the original query.
  • The current complexity grows fast because it multiplies. Here, you can expect 20 results from the first statement pattern, which then have to be joined with the results of the second triple pattern, giving a complexity of 20 * 43 = 860.
  • Although the complexity for the whole group is 860, the estimated number of iterations for this group is only about 25.3.
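
The reordering and the arithmetic can be retraced with a short script. The following is a simplified sketch that uses only the figures from the output above. It orders patterns by collection size alone, which is not necessarily everything the engine takes into account. The last step is an assumption: this page does not say how the estimated number of iterations is derived, but dividing the complexity by the 34 unique subjects of the second pattern happens to reproduce the reported figure for this example.

```python
# Figures copied from the explain plan output above, in original query order.
patterns = [
    {"pattern": "?o rdf:type ?o1", "collection_size": 43.0, "unique_subjects": 34.0},
    {"pattern": "?o rdfs:subPropertyOf ?o2", "collection_size": 20.0, "unique_subjects": 19.0},
]

# Simplification: order by collection size only, smallest first.
ordered = sorted(patterns, key=lambda p: p["collection_size"])

# The complexity multiplies as each pattern is joined in.
complexity = 1.0
for p in ordered:
    complexity *= p["collection_size"]
    print(p["pattern"], "-> current complexity:", complexity)

# ASSUMPTION (not stated in the source): divide by the unique subjects of the
# last joined pattern. This matches the reported 25.29... for this example only.
estimated_iterations = complexity / ordered[-1]["unique_subjects"]
print("estimated iterations:", round(estimated_iterations, 1))
```

Treat the division as an observation about this one plan, not as a documented formula.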

Wine queries

All of the following examples refer to our simple wine dataset (wine.ttl). The file is quite small; here is a basic explanation of the data:

  • There are different types of wine (Red, White, Rose).
  • Each wine has a label.
  • Wines are made from different types of grapes.
  • Wines contain different levels of sugar.
  • Wines are produced in a specific year.

First query with aggregation

A typical aggregation query contains a group with some aggregation function. Here, we have added an experimental-explain graph:

# Retrieve the number of wines produced in each year along with the year
PREFIX onto: <http://www.ontotext.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX : <http://www.ontotext.com/example/wine#>
SELECT (count(?wine) as ?wines) ?year
from onto:experimental-explain
WHERE {
    ?wine rdf:type :Wine .
    optional {            
        ?wine :hasYear ?year        
    }
}
group by ?year
ORDER BY DESC(?wines)

When you execute the query on GraphDB, you get the following as an output (instead of the real results):

SELECT (COUNT(?wine) AS ?wines) ?year
{

  { # ----- Begin optimization group 1 -----        
    ?wine rdf:type onto:example/wine#Wine . # Collection size: 5.0                                                
                                            # Predicate collection size: 64.0                                                
                                            # Unique subjects: 50.0                                                
                                            # Unique objects: 12.0                                                
                                            # Current complexity: 5.0      
   } # ----- End optimization group 1 -----
   # ESTIMATED NUMBER OF ITERATIONS: 5.0

      OPTIONAL
      {

        { # ----- Begin optimization group 2 -----        
  
          ?wine onto:example/wine#hasYear ?year . # Collection size: 5.0                                                  
                                                  # Predicate collection size: 5.0                                                  
                                                  # Unique subjects: 5.0                                                  
                                                  # Unique objects: 2.0                                                  
                                                  # Current complexity: 5.0        

        } # ----- End optimization group 2 -----
        # ESTIMATED NUMBER OF ITERATIONS: 5.0

     }
}
GROUP BY ?year
ORDER BY DESC(?wines)
LIMIT 1000