
RS-1370

Name                 Size    Creator           Creation Date
ecrm-simplified.owl  236 kB  Vladimir Alexiev  Feb 15, 2013 13:52
ecrm-inverses.owl    236 kB  Vladimir Alexiev  Dec 18, 2012 16:38
ecrm-simplify.xq     3 kB    Vladimir Alexiev  Dec 18, 2012 16:38

To: ECRM, CRM SIG. Date: 12/18/2012
Subject: ECRM simplification; RDFS+inverses CRM version

ecrm-simplify.xq

I wrote an XQuery script, ecrm-simplify.xq, that simplifies the ecrm-current.owl file by removing some OWL constructs.
It always keeps owl:inverseOf and owl:SymmetricProperty (self-inverses): they are an innate part of CRM.
Parameter keep= gives a comma-separated list of other features to keep:

  • transitive: owl:TransitiveProperty
  • restriction: owl:Restriction (blank-node subClassOf)
  • functional: owl:FunctionalProperty, owl:InverseFunctionalProperty
  • disjoint: owl:disjointWith

So this script can make "ECRM profiles" that use only a subset of OWL constructs.
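
The filtering idea can be sketched in Python. This is not the XQuery script itself (which is attached but not reproduced here), only a minimal illustration under stated assumptions: the ontology is RDF/XML in the nested style where restrictions sit inside rdfs:subClassOf and property characteristics appear as rdf:type children; typed-node forms such as <owl:TransitiveProperty rdf:about="..."> are not handled. The function names and the sample ontology (with example.org IRIs) are invented for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Feature name -> OWL types asserted via <rdf:type rdf:resource="..."/>.
TYPE_FEATURES = {
    "transitive": {OWL + "TransitiveProperty"},
    "functional": {OWL + "FunctionalProperty",
                   OWL + "InverseFunctionalProperty"},
}


def should_drop(child, keep):
    """Decide whether one child element belongs to a feature not in `keep`."""
    if child.tag == f"{{{RDF}}}type":
        resource = child.get(f"{{{RDF}}}resource")
        for feature, types in TYPE_FEATURES.items():
            if resource in types:
                return feature not in keep
        return False
    if child.tag == f"{{{OWL}}}disjointWith":
        return "disjoint" not in keep
    if child.tag == f"{{{RDFS}}}subClassOf":
        # Only blank-node superclasses (nested owl:Restriction) are removed;
        # named superclasses are always kept.
        is_restriction = child.find(f"{{{OWL}}}Restriction") is not None
        return is_restriction and "restriction" not in keep
    # owl:inverseOf, owl:SymmetricProperty and everything else is kept.
    return False


def simplify(rdfxml, keep=""):
    """Return the RDF/XML text with unwanted OWL constructs removed."""
    keep_set = {name.strip() for name in keep.split(",") if name.strip()}
    root = ET.fromstring(rdfxml)
    # Materialize the element list first so removal is safe while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            if should_drop(child, keep_set):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Invented sample data, for illustration only.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/E72_Legal_Object">
    <rdfs:subClassOf rdf:resource="http://example.org/E70_Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://example.org/P104_is_subject_to"/>
        <owl:someValuesFrom rdf:resource="http://example.org/E30_Right"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
  <owl:ObjectProperty rdf:about="http://example.org/P9_consists_of">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#TransitiveProperty"/>
    <owl:inverseOf rdf:resource="http://example.org/P9i_forms_part_of"/>
  </owl:ObjectProperty>
</rdf:RDF>"""

if __name__ == "__main__":
    print(simplify(SAMPLE, keep="transitive"))
```

With keep="transitive", the sketch drops the restriction but retains the named superclass, the transitive type and owl:inverseOf; with an empty keep list the transitive type is dropped as well.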

Save Space in RS

RS-1279
In ResearchSpace we use ecrm-simplified.owl. Why did we need this?
Because the ECRM owl:Restrictions add 20-25% more statements that are useless for us.
(I'd appreciate pointers to any system that uses them).

The CRM class hierarchy is quite deep, and each of these restrictions is propagated through it.
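
The propagation can be illustrated with a small Python sketch. It assumes plain RDFS subclass inference: an instance receives an rdf:type statement for every superclass of its asserted class, including blank-node restriction superclasses. The subclass links are a fragment of the CRM hierarchy; the two restrictions are the ones quoted for E72_Legal_Object in the Manchester-syntax example in this memo. A real ECRM file attaches more restrictions to other classes, which are omitted here, and the function name is hypothetical.

```python
# Fragment of the CRM class hierarchy (child -> named parents).
SUBCLASS_OF = {
    "E18_Physical_Thing": ["E72_Legal_Object"],
    "E72_Legal_Object": ["E70_Thing"],
    "E70_Thing": ["E77_Persistent_Item"],
    "E77_Persistent_Item": ["E1_CRM_Entity"],
}

# Blank-node (owl:Restriction) superclasses; only those quoted in this memo.
RESTRICTIONS = {
    "E72_Legal_Object": [
        "P104_is_subject_to some E30_Right",
        "P105_right_held_by some E39_Actor",
    ],
}


def inferred_types(asserted_class):
    """Return (named classes, blank-node restrictions) inferred for an instance."""
    named, blank = [], []
    seen = set()
    stack = [asserted_class]
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        named.append(cls)
        blank.extend(RESTRICTIONS.get(cls, []))
        stack.extend(SUBCLASS_OF.get(cls, []))
    return named, blank


if __name__ == "__main__":
    named, blank = inferred_types("E18_Physical_Thing")
    print(len(named), "named types,", len(blank), "blank-node types")
```

Every instance of E18_Physical_Thing (or of any class below it) inherits the restrictions declared on E72_Legal_Object, so each restriction adds one rdf:type statement per instance of the whole subtree.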

We have these counts of rdf:type statements for 115k British Museum objects:

Type                            Statements  Share
_:nodeXX                          34675445  41.2%
owl:Thing                          5423480   6.4%
CRM classes                       43325284  51.5%
  crm:E1_CRM_Entity                5423480
  crm:E77_Persistent_Item          3869228
  crm:E70_Thing                    3800112
  crm:E72_Legal_Object             3688546
  crm:E71_Man-Made_Thing           3681210
  crm:E28_Conceptual_Object        2932619
  ...

So we’ll save 34.6M statements. (Next I’ll be looking at a way to eliminate the useless owl:Thing statements).
In ResearchSpace we use “transitive” (with a few added statements that I believe were forgotten) but not restriction, functional or disjoint.

Avoid Ontological Overcommitment

Apart from the practical considerations above, ECRM makes "ontological overcommitments" beyond the CRM standard.
E.g. it says (in Manchester syntax):

Class: ecrm:E72_Legal_Object
    SubClassOf:
        ecrm:P104_is_subject_to some ecrm:E30_Right,
        ecrm:P105_right_held_by some ecrm:E39_Actor,
        ecrm:E70_Thing

This means that Legal_Object is not only a subclass of Thing (as per the CRM standard), but every Legal_Object must also be related to some Right and some Actor.
However, such objects may not be present in cases of missing information, so what good is that assertion?

RDFS+inverses

Last summer Martin Doerr asked if I could extend the CRM RDFS version to include inverses, since they are an innate part of CRM.
Such a version is attached here (ecrm-inverses.owl), made from the ECRM version with my simplification script.
(The xmlns and xml:base at the beginning should be changed to the agreed CRM namespace.)

CRM Unification

The CRM community will benefit greatly from a single RDF definition of CRM.
This will not only eliminate duplicated effort by the different maintainers, but will also remove doubt in the community about which version to use.
The unified spelling of property and class names adopted by the CRM SIG on Nov 20 is an important step in that direction.

If the two groups want to collaborate, maybe the best approach is:

  • maintain the unified ontology in Protégé
  • export an OWL version from Protégé
  • use the simplification script to make “application profiles” as described above

Below are some next steps. Who can help?

  1. diff the two versions to find any discrepancies
    • This can be done with the Protégé plugin. I started doing it once, but dropped it after 10 min…
    • There is OWL Patch: http://owl.cs.manchester.ac.uk/patch/
      However, I tried it from the command line and got this:
      > curl "http://owl.cs.manchester.ac.uk/patch/diff.php?a=http://cidoc-crm.org/rdfs/cidoc_crm_v5.0.4_official_release.rdfs&b=http://erlangen-crm.org/onto/ecrm/ecrm_current.owl"
      Sorry, can't diff!
      
  2. add an official namespace, something like http://cidoc-crm.org/ns or http://cidoc-crm.org/ns/
    ResearchSpace will be willing to adopt this namespace *if* agreed by Jan 15.
    After that date we’ll be loading 1.5M BM objects (estimated 2.076B statements)
  3. take the multilingual labels from RDFS.
    The ECRM guys prefer the prop/class numbers to be included in the labels, so this needs to be argued
  4. take the longer rdfs:comments from ECRM.
    Optionally, split into skos:scopeNote vs skos:example
  5. take skos:notation from my email 8-Aug-2012. I posted a script that generates statements like:
    P91i_is_unit_of rdfs:label "P91 is unit of"; skos:notation "P91i"
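
A minimal Python sketch of how such statements could be derived from a class or property local name. This is not the script from the 8-Aug-2012 email; the function name and the regular expression are assumptions, chosen so that the example above is reproduced. Names that do not follow the "letter, number, optional i" pattern are rejected rather than guessed at.

```python
import re

# Local names look like "P91i_is_unit_of" or "E72_Legal_Object":
# a letter, a number, an optional "i" marking an inverse, then the words.
LOCAL_NAME = re.compile(r"^([EP]\d+)(i?)_(.+)$")


def notation_statement(local_name):
    """Return a Turtle-like statement with rdfs:label and skos:notation."""
    match = LOCAL_NAME.match(local_name)
    if match is None:
        raise ValueError("not a CRM-style local name: %r" % local_name)
    number, inverse, words = match.groups()
    # The label uses the bare number; the notation keeps the inverse marker.
    label = number + " " + words.replace("_", " ")
    notation = number + inverse
    return '%s rdfs:label "%s"; skos:notation "%s"' % (local_name, label, notation)


if __name__ == "__main__":
    print(notation_statement("P91i_is_unit_of"))
```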