{excerpt}Some Rembrandt data rework, related to harmonization and improvements{excerpt}
{toc}
h1. Way of working
- Vlado writes the spec, makes the changes to susana.ttl and produces the diff
- Matthew makes changes to Migration
- Jana makes changes to RForm templates
- Mitac makes changes to EntityAPI (hopefully not many will be needed)
Many of these changes are marked +WILL+ (then +DID+) in [Rembrandt Mapping Review]. Some were investigated in [RS-323@jira].
h2. Color Status
In the next section we use the color-coded \{status} macro to indicate progress.
When you're done with your part, please edit the page (in wiki mode!) and change the "colour=" value.
{status:colour=Gray|title=Gray}: not applicable
{status:colour=Red|title=Red}: not yet done
{status:colour=Yellow|title=Yellow}: just about done, needs someone else's attention
{status:colour=Green|title=Green}: done
I have not added a box for EntityAPI (Mitac's part), since hopefully those changes will be few.
h1. Changes
h2. Remove part/1
BM data doesn't have parts. For harmonization and simplification:
- get rid of part/1, and put all its properties directly on the object.
I've been assured this won't constitute lying about its production, creator, material
- treat part/2 (the frame) as an accessory (less important) part.
Keep its URI as is, no need to change.
- has_number_of_parts: output 1 if there is a frame; no property if there is no frame (*Matthew* please take note)
- get rid of rso:P46_has_main_part
-- Keep rso:P46_has_other_part: needed by [properties.txt|#Update properties.txt].
-- We could compute this (as rso:P46_has_proper_part) from the standard property, then maybe use it to resolve the [FR BUG|FR Implemenatation|#BUG]:
{code}
x rso:P46_has_proper_part y := x crm:P46_is_composed_of y AND NOT (x rdf:type crm:E78_Collection)
{code}
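To make the intended change concrete, here is a minimal before/after sketch for one object. It is an illustration only: the material value, the rkd-material: prefix and the use of crm:P45_consists_of and crm:P57_has_number_of_parts are assumptions, and prefix declarations are omitted as in the other snippets on this page.
{code}
# Before: descriptive data hangs off part/1 (illustrative values)
<obj/2926> rso:P46_has_main_part <obj/2926/part/1> ;
    rso:P46_has_other_part <obj/2926/part/2> .
<obj/2926/part/1> crm:P45_consists_of rkd-material:canvas .   # hypothetical material URI

# After: part/1 is gone and its properties sit on the object; the frame stays as part/2
<obj/2926> crm:P45_consists_of rkd-material:canvas ;          # moved from part/1
    rso:P46_has_other_part <obj/2926/part/2> ;
    crm:P57_has_number_of_parts 1 .                           # emitted only because a frame exists
{code}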
{status:colour=Green|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h2. Update properties.txt
RS-92 uses a file of "business meaningful properties" to collect a complete Museum Object. Update this list based on the BM mapping and the Rembrandt changes.
{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h2. {{rdfs:label}} vs {{crm:P3_has_note}}
Following Martin's recommendation, BM will use {{rdfs:label}} for the main label of every node, and {{crm:P3_has_note}} only for auxiliary notes. This is useful, since we may decide to skip those BM P3_has_note values that merely duplicate structured data as a label, e.g. "Width :: 23.0".
However, this is not yet adopted by the CRM SIG, and it's unclear whether we're getting rid of P3_has_note altogether. So unless Jana says otherwise, we'll keep P3_has_note for Rembrandt.
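As a sketch of the difference, assume a hypothetical dimension node (the URI and the auxiliary note text are invented for illustration). The two statements are alternatives, not meant to be loaded together:
{code}
# BM convention: main label in rdfs:label, P3_has_note only for auxiliary remarks
<obj/2926/dimension/1> rdfs:label "Width :: 23.0" ;
    crm:P3_has_note "Measured without frame" .     # hypothetical auxiliary note

# Rembrandt for now: the same text stays in P3_has_note
<obj/2926/dimension/1> crm:P3_has_note "Width :: 23.0" .
{code}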
{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h2. Searchability and Display Fields
We cannot display search results including sub-objects (eg a Document or Related drawing that lacks most fields). That's why they should not be searchable, which we've accomplished by introducing E22_Museum_Object and marking only top-level objects with that class.
- Rethink where we find the Display fields, so we can display both Rembrandt and BM objects
- BM data doesn't include E22_Museum_Object
-- If possible, we should formulate a different criterion for searchability, but I don't think CRM has such a notion of "top-level or independent object"
-- If not, we should add such a class to BM data
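A minimal sketch of the current marking, assuming the marker class lives in the rso: namespace and using a hypothetical sub-object URI:
{code}
# Top-level object: carries the marker class, so it is searchable
<obj/2926> a rso:E22_Museum_Object .

# Sub-object (e.g. a Document): ordinary CRM class only, so it is not searchable
<obj/2926/document/1> a crm:E31_Document .          # hypothetical URI
{code}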
{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h2. Disentangle P2_has_type by introducing sub-properties
When two thesauri are mapped to P2_has_type, selecting "New value" in data annotation doesn't work, since the system cannot determine which thesaurus to use. Therefore sub-properties should be introduced. P2_has_type can be used as-is only if the node has a single "type".
We replace P2_has_type with the following:
- Object:
-- rso:P2_has_object_type (rkd-object)
-- rso:P2_has_object_shape (rkd-shape)
- Image: ([RS-627@jira] "Cannot determine thesaurus for image type")
-- rso:P129_has_iconclass (rst-iconclass)
-- rso:P129_has_keyword (rkd-keywords)
-- (!) TODO Vlado: add clause "P129_is_about E55_Type" to FR2_has_type, else we can't search by IconClass/Keywords
- Frame: rso:P2_has_object_type (rkd-object)
- File:
-- rso:P2_has_object_status (rkd-objectstatus)
-- rso:P2_has_area_captured (rkd-area_captured: FRONT/BACK and OVERALL/DETAIL).
Note: Jana, we cannot split rso:P2_has_area_captured into two separate fields for <file.spec.overall_detail> vs <file.spec.front_back>,
because rkd-area_captured (see thesauri-all.ttl) has many tangled values, e.g. "whole (front)". So we keep 2 instances of this property, and leave it to the user to put two "compatible" values in them.
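The replacement could look roughly like this in the ontology and in the data. The rdfs:subPropertyOf axioms are inferred from the property names and the heading; the concept local names (painting, rectangle) are hypothetical.
{code}
# Ontology (rso.ttl): each thesaurus gets its own sub-property
rso:P2_has_object_type   rdfs:subPropertyOf crm:P2_has_type .
rso:P2_has_object_shape  rdfs:subPropertyOf crm:P2_has_type .
rso:P129_has_iconclass   rdfs:subPropertyOf crm:P129_is_about .
rso:P129_has_keyword     rdfs:subPropertyOf crm:P129_is_about .

# Data: the property now tells us which thesaurus a new value must come from
<obj/2926> rso:P2_has_object_type rkd-object:painting ;     # hypothetical concept
    rso:P2_has_object_shape rkd-shape:rectangle .           # hypothetical concept
{code}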
{status:colour=Green|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h3. Single-out Object Type
We display the object type (painting, coin, etc) as one of the display fields ([RS-690]).
For this reason it needs to be singled out among all other E55_Type values. How can we do that?
# We could leave it as the only field per node mapped to P2_has_type.
But then, to distinguish it, two functions need to use *explicit* triples only (while FRs, of course, use inferred triples):
#- "Fetch complete object data" for display
#- "Propose new value"
# We could map it to rso:P2_has_object_type (and a similar bmo: extension property), and filter by thesaurus (rkd-object and bm-thes-object respectively)
I like the second approach better since it doesn't rely on a particular query mode (and we may change that tomorrow).
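For the second approach, the filter relies on each concept declaring its thesaurus via skos:inScheme. A sketch for the Rembrandt side, with a hypothetical concept name (the BM side would be analogous, with bm-thes-object):
{code}
<obj/2926> rso:P2_has_object_type rkd-object:painting .   # hypothetical concept
rkd-object:painting a crm:E55_Type, skos:Concept ;
    skos:inScheme rkd-object: .                           # the thesaurus to filter on
{code}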
Mitac & Jana, please give your opinion here.
(!) Jana, can you explain here *why* one property should not have multiple values from several thesauri?
{status:colour=Red|title=spec}{status:colour=Red|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h3. Unchangeable P2_has_type
*Jana*, please note that the following type properties (P2_has_type and P21_had_general_purpose) are unchangeable (fixed).
Their values don't come from a thesaurus, i.e. don't have skos:inScheme, so a new value cannot be proposed.
{code}
<obj/2926/acquisition/1/price> crm:P2_has_type rst-currency: . # "Price"
<obj/2926/research/1> crm:P21_had_general_purpose rkd-res_type: . # "Research"
{code}
(This is for information only, no change)
h2. Business-specific sub-properties
Maria [RS-273@jira]: we planned initially that data in a record will be grouped into sections: Basics, Parts, Exhibitions, Auctions, Collections, etc.
But RForms cannot create different sections (lists) based on P2_has_type of a node: it can distinguish only by the relation (property) that leads to the node.
- Make business-specific sub-properties
-- rso:P12i_was_present_at_exhibition (sub-property of crm:P12i_was_present_at, inverse rso:P12_exhibited)
-- rso:P12i_was_present_at_research (sub-property of crm:P12i_was_present_at, inverse rso:P12_researched)
-- rso:P24i_changed_ownership_through_auction (inverse rso:P12_auctioned)
- change P11 to P14 for auction house:
{code}<obj/2926/acquisition/1> crm:P14_carried_out_by <obj/2926/acquisition/1/house>.{code}
- (!) Maria/Jana to specify whether more sub-properties are needed
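A sketch of how these declarations might look in rso.ttl. The use of owl:inverseOf and the parent of the auction property (crm:P24i_changed_ownership_through) are assumptions, since they are not stated above; the exhibition URI is hypothetical.
{code}
rso:P12i_was_present_at_exhibition rdfs:subPropertyOf crm:P12i_was_present_at ;
    owl:inverseOf rso:P12_exhibited .
rso:P12i_was_present_at_research rdfs:subPropertyOf crm:P12i_was_present_at ;
    owl:inverseOf rso:P12_researched .
rso:P24i_changed_ownership_through_auction
    rdfs:subPropertyOf crm:P24i_changed_ownership_through ;   # assumed parent
    owl:inverseOf rso:P12_auctioned .

# Usage: the relation alone identifies the "Exhibitions" list
<obj/2926> rso:P12i_was_present_at_exhibition <obj/2926/exhibition/1> .   # hypothetical event URI
{code}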
{status:colour=Yellow|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Red|title=rforms}
h2. Thesaurus changes
- add IconClass code as [SKOS Notation|http://www.w3.org/TR/skos-primer/#secnotations]:
{code}
rst-iconclass:_11H_JEROME_51 a crm:E55_Type, skos:Concept;
skos:notation "11 H (JEROME) 51"^^rst-iconclass: .
{code}
(ResearchSpace does not use this currently, but could)
- Verify that Rembrandt and BM thesauri satisfy [BMX Issues#Thesaurus Requirements], and make appropriate changes
Rembrandt thesauri:
- (/) thesauri.ttl
- (/) thesauri-all.ttl
- (/) thesauri-disposition.ttl
- (/) thesauri-extracted.ttl
- thesauri-place.ttl
-- rkd-places: replace P89_falls_within with P88i_forms_part_of, else FR won't work (this is a bug)
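For illustration, with hypothetical place concepts (the local names are not taken from thesauri-place.ttl):
{code}
# Before: FR won't work with this property
rkd-places:Amsterdam crm:P89_falls_within rkd-places:Netherlands .

# After
rkd-places:Amsterdam crm:P88i_forms_part_of rkd-places:Netherlands .
{code}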
{status:colour=Green|title=spec}{status:colour=Green|title=diff}{status:colour=Red|title=mig}{status:colour=Gray|title=rforms}
(x) BM thesauri: [RS-700@jira]
h1. Diffs
{attachments:patterns=.*patch}
- You can view these with TortoiseUDiff (part of TortoiseSVN)
- Below we show the most important parts as images, taken from the same program
- I personally prefer to view Tortoise>Diff with Araxis Merge, which shows word/char-level changes
Setup: rclick> TortoiseSVN> Settings> External Programs> Diff viewer> External
{noformat}"C:\Program Files (x86)\Araxis Merge v6.5\compare.exe" /max /wait /title1:%bname /title2:%yname %base %mine{noformat}
!susana.ttl.patch.araxis.png!
h2. rso.ttl
!rso.ttl.patch.png!
h2. susana.ttl
!susana.ttl.patch-1.png!
!susana.ttl.patch-2.png!
!susana.ttl.patch-3.png!
!susana.ttl.patch-4.png!
!susana.ttl.patch-5.png!
h2. thesauri.ttl
!thesauri.ttl.patch.png!
etc
h2. thesauri-place.ttl
!thesauri-place.ttl.patch.png!
etc