Current by Vladimir Alexiev
on Feb 12, 2013 10:52.

{excerpt}BM extensions to CRM{excerpt}
Developed by Seme4 as part of BM collection data mapping to RDF in the CRM schema
{excerpt}BM extensions to CRM. This is obsolete: see [BM Mapping]{excerpt}
{toc}
{attachments}

h1. BMX Description
(i) [^D2 - Commentary on the mapping process- from BM to CIDOC-CRM.pdf], Hugh Glaser, SotonU
- very informative description of the mapping, shows the complexity of museum data.
- detailed description of the developed CRM extension ontology (BMX).
- [^bm-extensions.ttl]: BMX ontology definition. It's RDFS only, not OWL
- [http://crm.rkbexplorer.com]: live links to CRM & BMX classes

h1. BM Data
- 14 GB of XML files exported from BM's collection system (Merlin). The export took several weeks (SSL says it would have been faster had they known better)
- RDF takes between 12 and 24 hours to assert in the current triple store (IntelliDimension)
- Exported from Merlin, a C/C+\+ system with a proprietary OO database. BM's Merlin model is based on the SPECTRUM standard for Museum Documentation. The functionality of other museum collection systems is also based on SPECTRUM
- [http://collection.britishmuseum.org/]: live links to BM objects
(was [http://bm2.rkbexplorer.com] during development)
- instructive usage guide for the CRM

[^D1 - BM to ResearchSpace- Conversion Process and Schema.pdf], Ian Millard, Seme4. "Recommended process for converting data from a Collections Management System in the Cultural Heritage domain into Open Linked Data"
- demonstrate feasibility for converting the entire Collections data from the British Museum into Linked Data
- describes the developed ETL tool "makeRDF": based on config files, uses XPath, similar to XSLT but simpler
- nice intro to CRM
- recommended URI allocation:
[http://collection.britishmuseum.org/object/PPA1818832] : artifact object (non-information resource)
[http://collection.britishmuseum.org/object/PPA1818832.html] : HTML representation (information resource for humans)
[http://collection.britishmuseum.org/object/PPA1818832.rdf] : RDF representation (information resource for computers)
[http://collection.britishmuseum.org/object/PPA1818832/production] : info about production (who created it, when, and where)
etc
- Merlin is a C/C+\+ system with a proprietary OO database
- BM has 2M objects, of which 560k have images. The CRM RDF has roughly 100 triples per object
- RDF takes between 12 and 24 hours to assert depending on the triple store employed
- BM's Merlin model is based on the SPECTRUM standard for Museum Documentation. The functionality of other museum collection systems is also based on SPECTRUM
[http://collection.britishmuseum.org/object/PPA1818832/acquisition] : info about acquisition (when the BM got it)
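The URI allocation pattern listed above can be sketched as a small helper. This is only an illustration: the function name object_uris and the dictionary keys are assumptions made for this example, and it covers only the URI forms shown on this page (the "etc" in the list suggests there are more).

```python
# Hypothetical helper: derives the URIs for one object from its identifier,
# following the pattern listed on this page. Names and keys are illustrative.
BASE = "http://collection.britishmuseum.org/object/"


def object_uris(object_id):
    """Return the URI of the object itself plus its related resources."""
    thing = BASE + object_id
    return {
        "thing": thing,                          # non-information resource
        "html": thing + ".html",                 # representation for humans
        "rdf": thing + ".rdf",                   # representation for computers
        "production": thing + "/production",     # production event
        "acquisition": thing + "/acquisition",   # acquisition event
    }


uris = object_uris("PPA1818832")
for name, uri in uris.items():
    print(name, uri)
```

Keeping the object URI separate from its .html and .rdf representations is what lets the same identifier serve both humans and machines.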

h1. Need for CRM extensions
- CRM is a very rich ontology, but still not rich enough to capture collection data from different institutions
- BMX shows that extensions are needed
- the BM should drive towards unification and vigorously promote such extensions for CRM standardization, rather than relying on thesaurus mapping (mapping profiles) and co-reference, which make for more difficult reasoning
h1. BMX Description
Developed by Seme4 as part of BM collection data mapping to RDF in the CRM schema.
Extended/corrected by Josh following review by Vlado.
The following documents describe the BM data:

h1. BMX problems
BMX itself can be improved. Some examples (mostly from the area of measurements, e.g. width, height):
- Rather than using BM-specific URIs for units, it'd be better to map during conversion to ontologies that are more established in this area: NASA QUDT (Quantities, Units, and Dimensions) or OASIS UoM (Units of Measure)
- CRM uses a single number (P90_has_value is E60_Number) while BMX extends this to an interval (PX.min_value and PX.max_value)
-- Because min/max are declared as sub-properties of P90_has_value, you end up with P90_has_value having two values
-- I think it's better to compute P90_has_value during conversion, e.g. to be the average of min and max (or equal to min if max is undefined)
-- See [Imprecise Begin-End (and general)] for further discussion
- CRM says that extensions should define properties as either sub-properties, or "long-cut" paths of existing CRM properties.
But BMX defines various properties that are neither sub-properties, nor long-cuts.
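
The suggestion above to compute a single P90_has_value from the BMX interval can be sketched as follows. This is a minimal illustration, not part of makeRDF or BMX: the function name single_value is an assumption, and the rule is the one proposed in the list (average of min and max, or min when max is undefined).

```python
from typing import Optional


def single_value(min_value: float, max_value: Optional[float] = None) -> float:
    """Collapse an interval into one number suitable for P90_has_value.

    Hypothetical helper: returns min_value when max_value is undefined,
    otherwise the average of the two bounds.
    """
    if max_value is None:
        return min_value
    return (min_value + max_value) / 2


both_bounds = single_value(10.0, 12.0)  # average of the two bounds
only_min = single_value(10.0)           # falls back to the minimum
print(both_bounds, only_min)
```

With this approach the interval bounds would no longer need to be sub-properties of P90_has_value, so each dimension would carry exactly one value.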
h2. D1 Conversion
[^D1 - BM to ResearchSpace- Conversion Process and Schema.pdf]. Ian Millard, Seme4.
"Recommended process for converting data from a Collections Management System in the Cultural Heritage domain into Open Linked Data"
- demonstrate feasibility for converting the entire Collections data from the British Museum into Linked Data
- describes the developed ETL tool "makeRDF": based on config files, uses XPath, similar to XSLT but simpler
- describes the technical aspects of data conversion
- nice intro to CRM
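
To give a feel for what a config-driven, XPath-based conversion looks like, here is a minimal sketch. It is not makeRDF and does not reproduce its config format: the XML element names, the ex: predicates, and the function make_triples are all hypothetical, and the sketch relies only on the limited XPath subset supported by Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the real Merlin export has a different structure.
RECORD = (
    '<object id="PPA1818832">'
    "<title>Example title</title>"
    "<maker>Example maker</maker>"
    "</object>"
)

# Hypothetical mapping "config": predicate -> XPath relative to the record root.
MAPPING = {
    "ex:title": "./title",
    "ex:maker": "./maker",
}

BASE = "http://collection.britishmuseum.org/object/"


def make_triples(xml_text, mapping, base):
    """Apply each configured XPath to the record and emit (s, p, o) tuples."""
    root = ET.fromstring(xml_text)
    subject = base + root.get("id")
    triples = []
    for predicate, path in mapping.items():
        for node in root.findall(path):
            triples.append((subject, predicate, node.text))
    return triples


triples = make_triples(RECORD, MAPPING, BASE)
for triple in triples:
    print(triple)
```

The point of the design is that the mapping lives in data rather than code, so changing the conversion means editing the config, not the tool.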

h2. D2 Mapping
(i) [^D2 - Commentary on the mapping process- from BM to CIDOC-CRM.pdf], Hugh Glaser, SotonU
- very informative description of the mapping and the thinking behind it, shows the complexity of museum data.
- description of BMX
- instructive usage guide for the CRM

h2. Sample object
[^BM-object-GAA87981.pdf]: Shows all triples for a single object, and describes them

h2. BMX ontology
- [http://crm.rkbexplorer.com]: live links to CRM & BMX class & property descriptions
- [^bm-extensions.ttl]: BMX definition. It's RDFS only, not OWL
- [^BM-extensions.pdf]: BMX description