
{excerpt}Some Rembrandt data rework, related to harmonization and improvements{excerpt}
{toc}

h1. Way of working:
- Vlado makes changes to susana.ttl and diff
- Matthew makes changes to Migration
- Jana makes changes to RForm templates
- Mitac makes changes to EntityAPI (hopefully not many will be needed)

A lot of these are changes marked +WILL+ in [Rembrandt Mapping Review]. Some were investigated in [RS-323@jira].

h1. Changes

h2. Remove part/1
BM data doesn't have parts. For harmonization and simplification:
- get rid of part/1, and put all its properties directly on the object.
I've been assured this won't constitute lying about its production, creator, material
- treat part/2 (the frame) as an accessory (less important) part.
Keep its URI as is, no need to change.
- has_number_of_parts: output 1 if there is a frame; no property if there is no frame (*Matthew* please take note)
- get rid of rso:P46_has_main_part,
-- Keep rso:P46_has_other_part: needed by [properties.txt|#Update properties.txt].
-- We could compute this (as rso:P46_has_proper_part) from the standard property, then maybe use it to resolve the [FR BUG|FR Implemenatation|#BUG]:
{code}
x rso:P46_has_proper_part y := x crm:P46_is_composed_of y AND NOT (x rdf:type crm:E78_Collection)
{code}
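As a rough before/after sketch of this change: the part URIs and the material term below are made up for illustration, crm:P45_consists_of and crm:P57_has_number_of_parts are assumed to be the properties involved, and prefix declarations are omitted as in the other snippets on this page.
{code}
# Before: properties hang off part/1 (hypothetical data)
<obj/2926> rso:P46_has_main_part <obj/2926/part/1> ;
    rso:P46_has_other_part <obj/2926/part/2> .
<obj/2926/part/1> crm:P45_consists_of <thesaurus/material/panel> .

# After: part/1 is gone and its properties sit on the object; the frame stays as part/2
<obj/2926> crm:P45_consists_of <thesaurus/material/panel> ;
    rso:P46_has_other_part <obj/2926/part/2> ;
    crm:P57_has_number_of_parts 1 .   # emitted only because a frame exists
{code}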

{status:colour=Green|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

h2. Update properties.txt
RS-92 uses a file of "business meaningful properties" to collect a complete Museum Object. Based on BM mapping and Rembrandt Changes, update this list

{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

h2. {{rdfs:label}} vs {{crm:P3_has_note}}
Following Martin's recommendation, BM will use {{rdfs:label}} for the main label of every node, and {{crm:P3_has_note}} only for auxiliary notes. This is useful, since we may decide to skip those P3_has_note values that merely duplicate structured data in a label, eg "Width :: 23.0".

However, this is not yet adopted by the CRM SIG, and it's unclear whether we're getting rid of P3_has_note altogether. So unless Jana says otherwise, we'll keep P3_has_note for Rembrandt.

{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

h2. Searchability and Display Fields
We cannot display search results including sub-objects (eg a Document or Related drawing that lacks most fields). That's why they should not be searchable, which we've accomplished by introducing E22_Museum_Object and marking only top-level objects with that class.
- Rethink where we find the Display fields, so we can display both Rembrandt and BM objects
- BM data doesn't include E22_Museum_Object
-- If possible we should formulate a different criterion for searchability, but I don't think CRM has such a notion of "top-level or independent object"
-- If not, we should add such a class to BM data

{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

h2. Disentangle P2_has_type by introducing sub-properties
When two thesauri are mapped to P2_has_type, selecting New value in data annotation doesn't work since it cannot determine which thesaurus to use. Therefore sub-properties should be introduced. P2_has_type can be used as-is only if the node has a single "type".
We replace P2_has_type with the following:
- Object:
-- rso:P2_has_object_type (rkd-object)
-- rso:P2_has_object_shape (rkd-shape)
- Image: ([RS-627@jira] "Cannot determine thesaurus for image type")
-- rso:P129_has_iconclass (rst-iconclass)
-- rso:P129_has_keyword (rkd-keywords)
-- (!) TODO Vlado: add clause "P129_is_about E55_Type" to FR2_has_type, else we can't search by IconClass/Keywords
- Frame: rso:P2_has_object_type (rkd-object)
- File:
-- rso:P2_has_object_status (rkd-objectstatus)
-- rso:P2_has_area_captured (rkd-area_captured: FRONT/BACK and OVERALL/DETAIL).
Note: Jana, we cannot split rso:P2_has_area_captured to two separate fields for <file.spec.overall_detail> vs <file.spec.front_back>,
since if you look in thesauri-all.ttl, rkd-area_captured has many tangled values, eg "whole (front)". So we leave 2 instances of this property, and leave it to the user to put two "compatible" values in them.
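A minimal sketch of how these sub-properties might be declared and used. The parent properties are inferred from the property names rather than copied from rso.ttl, the file URI and the two rkd-area_captured local names are hypothetical, and prefix declarations are omitted as in the other snippets on this page.
{code}
# Assumed declarations: each sub-property is tied to a single thesaurus
rso:P2_has_object_type   rdfs:subPropertyOf crm:P2_has_type .
rso:P2_has_object_shape  rdfs:subPropertyOf crm:P2_has_type .
rso:P2_has_object_status rdfs:subPropertyOf crm:P2_has_type .
rso:P2_has_area_captured rdfs:subPropertyOf crm:P2_has_type .
rso:P129_has_iconclass   rdfs:subPropertyOf crm:P129_is_about .
rso:P129_has_keyword     rdfs:subPropertyOf crm:P129_is_about .

# Hypothetical file node carrying two "compatible" area values
<obj/2926/file/1> rso:P2_has_area_captured rkd-area_captured:FRONT, rkd-area_captured:OVERALL .
{code}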

{status:colour=Green|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

h3. Unchangeable P2_has_type
*Jana*, please note that the following P2_has_type properties are unchangeable (fixed).
They don't come from a thesaurus, i.e. don't have skos:inScheme, so a new value cannot be proposed.
{code}
<obj/2926/acquisition/1/price> crm:P2_has_type rst-currency: . # "Price"
<obj/2926/research/1> crm:P21_had_general_purpose rkd-res_type: . # "Research"
{code}

h2. Business-specific sub-properties
Maria [RS-273@jira]: we planned initially that data in a record will be grouped into sections: Basics, Parts, Exhibitions, Auctions, Collections, etc.
But RForms cannot create different sections (lists) based on P2_has_type of a node: it can distinguish only based on relation.
- Make business-specific sub-properties
-- rso:P12i_was_present_at_exhibition (sub-property of crm:P12i_was_present_at, inverse rso:P12_exhibited)
-- rso:P12i_was_present_at_research (sub-property of crm:P12i_was_present_at, inverse rso:P12_researched)
-- rso:P24i_changed_ownership_through_auction (inverse rso:P12_auctioned)
- change P11 to P14 for auction house:
{code}<obj/2926/acquisition/1> crm:P14_carried_out_by <obj/2926/acquisition/1/house>.{code}
- (!) Maria/Jana to specify whether more sub-properties are needed
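The declarations listed above could look roughly like this in Turtle. The parent of the auction property is an assumption, since only its inverse is named here, and prefix declarations are omitted as in the other snippets on this page.
{code}
rso:P12i_was_present_at_exhibition rdfs:subPropertyOf crm:P12i_was_present_at ;
    owl:inverseOf rso:P12_exhibited .
rso:P12i_was_present_at_research rdfs:subPropertyOf crm:P12i_was_present_at ;
    owl:inverseOf rso:P12_researched .
# Parent property assumed from the name
rso:P24i_changed_ownership_through_auction rdfs:subPropertyOf crm:P24i_changed_ownership_through ;
    owl:inverseOf rso:P12_auctioned .
{code}
With separate relations like these, RForms can build the Exhibitions, Research and Auctions sections by relation alone, without looking at P2_has_type.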

{status:colour=Yellow|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

h2. Thesaurus changes
- add IconClass code as [SKOS Notation|http://www.w3.org/TR/skos-primer/#secnotations]:
{code}
rst-iconclass:_11H_JEROME_51 a crm:E55_Type, skos:Concept;
  skos:notation "11 H (JEROME) 51"^^rst-iconclass: .
{code}
(currently will not be used by ResearchSpace but could be)
- Verify that Rembrandt and BM thesauri satisfy [BMX Issues#Thesaurus Requirements], and make appropriate changes

Rembrandt thesauri:
- (/) thesauri.ttl
- (/) thesauri-all.ttl
- (/) thesauri-disposition.ttl
- (/) thesauri-extracted.ttl
- thesauri-place.ttl
-- rkd-places: replace P89_falls_within with P88i_forms_part_of, else FR won't work (this is a bug)
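For illustration, the rkd-places fix amounts to a change like the following. The two place terms are hypothetical, and prefix declarations are omitted as in the other snippets on this page.
{code}
# Before
rkd-places:Leiden crm:P89_falls_within rkd-places:Netherlands .
# After
rkd-places:Leiden crm:P88i_forms_part_of rkd-places:Netherlands .
{code}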

{status:colour=Green|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}

(x) BM thesauri: [RS-700@jira]

h1. Diffs
{attachments:patterns=.*patch}
- You can view these with TortoiseUDiff (part of TortoiseSVN)
- Below we show the most important parts as images, taken from the same program
- I personally prefer to view Tortoise>Diff with Araxis Merge, which shows word/char-level changes
Setup: right-click > TortoiseSVN > Settings > External Programs > Diff viewer > External
{noformat}"C:\Program Files (x86)\Araxis Merge v6.5\compare.exe" /max /wait /title1:%bname /title2:%yname %base %mine{noformat}

!susana.ttl.patch.araxis.png!

h2. rso.ttl
!rso.ttl.patch.png!

h2. susana.ttl
!susana.ttl.patch-1.png!
!susana.ttl.patch-2.png!
!susana.ttl.patch-3.png!
!susana.ttl.patch-4.png!
!susana.ttl.patch-5.png!

h2. thesauri.ttl
!thesauri.ttl.patch.png!
etc

h2. thesauri-place.ttl
!thesauri-place.ttl.patch.png!
etc