
Counting and analysis of repository content


Counting

  • Total statements
    select (count(*) as ?c) {?s ?p ?o}
    
  • Statements per property.
    The maximum limit is 200 rows per query, so we fetch the results in two portions:
    select ?p (count(*) as ?c) {?s ?p ?o} group by ?p order by ?p
    select ?p (count(*) as ?c) {?s ?p ?o} group by ?p order by ?p offset 200
    
  • Class instances (note that one instance can have many rdf:type values!)
    select ?t (count(*) as ?c) {?s rdf:type ?t} group by ?t order by ?t
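
The two-portion retrieval of the per-property counts above can be generalised to any number of pages. The following is a minimal sketch, not the tooling actually used for this page: the helper name `paged_queries` and the example row count of 350 are hypothetical, and only the page size of 200 comes from the text. It makes the LIMIT and OFFSET explicit instead of relying on the server cap.

```python
PAGE_SIZE = 200  # row cap stated in the text above


def paged_queries(base_query, total_rows, page_size=PAGE_SIZE):
    """Return one query string per page, with explicit LIMIT and OFFSET.

    base_query must end with its ORDER BY clause so that the pages
    are cut from a stable ordering.
    """
    queries = []
    for offset in range(0, total_rows, page_size):
        queries.append(f"{base_query} limit {page_size} offset {offset}")
    return queries


base = "select ?p (count(*) as ?c) {?s ?p ?o} group by ?p order by ?p"

# 350 is a made-up number of distinct properties, used only for illustration.
for query in paged_queries(base, total_rows=350):
    print(query)
```

Keeping the `order by` clause in the base query matters: without a stable ordering, consecutive pages are not guaranteed to be disjoint.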
    

Analysis

We provide historic data, but focus on the latest data (BM-triples.xls of 2012-12).

Properties


Without sameAs expansion: 89995389 (2.9M = 3.1% fewer triples)

  • rdf:type=58426160 is 62.9% of all triples (see breakdown below)
  • Object (business) and thesaurus triples are 26.0+4.9=30.9%; we can assume that objects account for about 21% and thesauri for about 10%.
  • FRs=5751214 are 6.2% of all triples, or 29% of business triples
  • bmo:PX_physical_description=25584 ~ rso:FC70_Thing=23993 is 3x the number of objects (8k)!? This is due to owl:sameAs
  • owl:sameAs=72010 is 9x more than the 8k objects.
    Each object has 3 sameAs URIs (a,b,c), which causes 9 statements: aa bb cc ab bc ca ba cb ac
    That's what an equivalence relation will do to you.
  • skos:inScheme=357283 ~ skos:Concept=357318 is the total number of thesaurus terms
  • skos:exactMatch=4495 come from RKD. E.g. rkd-plaats:renaix and rkd-plaats:renaix give 4 triples (2 symmetric, 2 reflexive)
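
The sameAs arithmetic above can be checked with a short sketch. It assumes that every URI in an equivalence class is related to every URI in the same class, itself included, so a class of n URIs yields n*n statements. The function name `sameas_closure` is hypothetical; the count 72010 is taken from the list above.

```python
from itertools import product


def sameas_closure(uris):
    """All ordered pairs (x, y) among the URIs of one equivalence class.

    Includes the reflexive pairs (x, x), so n URIs give n * n pairs.
    """
    return list(product(uris, repeat=2))


# Three equivalent URIs per object, as in the owl:sameAs bullet.
print(len(sameas_closure(["a", "b", "c"])))  # 9

# Two equivalent URIs, as in the skos:exactMatch bullet.
print(len(sameas_closure(["x", "y"])))       # 4

# Dividing the reported owl:sameAs count by 9 recovers the object count.
SAMEAS_COUNT = 72010
print(round(SAMEAS_COUNT / 9))               # 8001
```

The last figure agrees with the "8k objects" quoted above.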

Types

  • _:nodeXX=23528903: 40.3% useless OWL DL restriction types
     crm:En_Whatever rdf:type [owl:Restriction...] 

    We could eliminate these (about 25% of all triples, i.e. 40.3% of the 62.9% that are rdf:type) by:

    1. Delete such statements after loading the ontologies and before loading the data
      delete where {?e rdfs:subClassOf ?t. ?t a owl:Restriction}
      
    2. Write a Perl script to cut down ECRM to RDFS+inverse (what Doerr wanted) + transitive
  • CRM classes=30864964: 52.8%. The counts decrease going down the class hierarchy, as expected:
    owl:Thing=3627096 ~ crm:E1_CRM_Entity=3626903
    crm:E77_Persistent_Item=3092726
    crm:E2_Temporal_Entity=240162
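
The type percentages above are shares of the rdf:type total, and can be reproduced with a few lines. This is only a sketch to make the arithmetic explicit: all counts are copied from this page, while the helper name `share` and the dictionary labels are hypothetical.

```python
RDF_TYPE_TOTAL = 58426160  # rdf:type statements, from the Properties section

# Counts from the Types list above.
type_counts = {
    "blank-node OWL restrictions": 23528903,
    "CRM classes": 30864964,
}


def share(count, total):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


for label, count in type_counts.items():
    print(f"{label}: {share(count, RDF_TYPE_TOTAL):.1f}% of rdf:type statements")

# Whatever is left over belongs to types not listed in this section.
remaining = RDF_TYPE_TOTAL - sum(type_counts.values())
print(f"other types: {share(remaining, RDF_TYPE_TOTAL):.1f}% of rdf:type statements")
```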

Statements, 1.5M estimate
