{excerpt}Notes from May-Jul 2013{excerpt}

h1. Intro

Lec: Our mapping is the same as Dominic's manual (to the best of our understanding)
Vlado: Does it comply with BM's latest changes to modeling Association codes (esp re Acquisition, Production)? [BM Association Mapping v2]. Dominic's document probably reflects this, but these are recent changes and I haven't checked.

YCBA uses the following systems:
- Bedework = calendaring
- TMS = art collections
- Drupal = website, exhibitions etc
- Orbis = books etc

h2. Getting Yale On-board
Lec: If there is something else you need to get this to work with ResearchSpace, please let me know
Vlado: Once it's compliant, we should:
- try loading the data
- load your thesauri. Complete Getty or only the subset used by your objects? How about Broader terms?
- implement Image Annotation over your DeepZoom images (we also use IIP Image for RKD images, so shouldn't be too hard)
- coreference some of your terms to enable cross-collection search
- create RForms for your objects

See [RS Plan 3.7#Get Yale on-board with ResearchSpace] for details. As of 08-Jul-2013, this iteration is under planning and the exact scope and start date are not clear. I think we can get Yale on board before mid-Sep (the Getty meeting), but it depends on the exact scope.

It appears that Yale is *not* the bottleneck: starting the RS3.7 iteration is. So please let's not rush this review process!

h2. Legend
Please don't use color or strikethrough, use the following symbols for easier tracking (easiest to edit them in Wiki Markup mode):
- (-) open issue
- (+) resolved issue
- (?) issue under discussion

h2. Eyeball
Lec: I am reviewing for any typos, missing types...

- (-) Have you tried Eyeball? See here: [RDF Validation and Conversion#Eyeball]
Lec: We tried Eyeball with no luck: we were not able to install it after a number of tries, so we have to contact the dev community. TBD.

h1. Problems

h2. General
- (-) Pubby prefixes are not set up: shows "?:..."
Lec: does same for BM, will try to fix
- (+) STRONGLY Suggest to have 1 URI per object, not 3 sameAs URIs
Lec: BM has multiple, followed their lead
Vlado: RS currently cannot work with these sameAs (eg would return results in triplicate). BM puts sameAs in separate files that we don't load
- (-) don't emit prefixes you don't need: lccn, oclc, ycba_aat, etc etc
- (-) crm:PX_* (e.g. crm:PX_display_wrap) is wrong, should be bmo:PX_*
Lec: fixed with []
Vlado: Please use bmo:PX_* and not ycba:PX_*: don't create a second property with the same purpose.
- (-) Please use bmo:EX_Association and not ycba:EX_Association: don't define your own class for the same purpose.

h2. Connected Resources
TODO Vlado: write down appropriate representations

It's best to list all URLs closely related to each object, eg
"Mrs. Abington as Miss Prue in Love for Love by William Congreve":
- (-) VUFind record (i.e. "home page")
Note: the last part is the CCD id []
- all [#Images], including Deep Zoom images
- LIDO record
Note: this link currently doesn't work
- ODAI (Yale Digital Collections Center) aggregation:
- Google Cultural Institute (Google Art Project)
Short: []
Long: []
- (/) Pubby page. I don't think we need it since

h1. Thesauri

h2. Whole Getty or Parts
Currently you emit thesaurus data together with the object, and only the terms used in the object:
- This way you miss Broader terms, so eg a search for "Animal" or "Mammal" won't find objects indexed only with a narrower term (see [FR Transitivity])
-- Lec: regarding Places, why don't we use Geographic Coordinates (included by Yale) and search by a bounding box?
-- Vlado: RS currently doesn't have bounding box search, because BM Places thesaurus doesn't include Geographic Coordinates.
RS has place name search, which uses the place hierarchy.
- This way you repeat data about the same term in many objects
- If you start emitting objects in separate graphs like BM does (to be able to easily replace/delete), the term data will be duplicated in each of these object graphs
- In several cases these in-object ad-hoc terms don't satisfy Thesaurus Requirements (see next section)

(-)(-) My strong recommendation is to export the complete Getty thesauri
- we shouldn't wait for Getty to do an official mapping, since it'll take a few months for TGN and ULAN, and it won't satisfy the requirement to publish as CRM (see next section)
- I can do this mapping. I'll also be involved in the Getty's mapping, so that's a good synergy
- Getty's committed to publishing as LOD, so hopefully they won't object, as long as we mark our export as Unofficial
- use a separate thesaurus export config, like BM does

I think this is the biggest outstanding Yale task.

h2. Thesaurus Requirements
You must comply with [BMX Issues#Thesaurus requirements]
Each term should be both a CRM entity of appropriate type, and skos:Concept.
- (-) You have this for some (eg Agents) but not others (eg Title Type).
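As a rough sketch of what this requirement implies for a local term such as a Title Type: the term carries both a CRM type and skos:Concept, and points to its scheme. The base IRI, the crm: namespace IRI, the scheme URI <thesaurus/title> and both labels are assumptions for illustration, not values taken from the YCBA export:

{noformat}
@base <http://collection.britishart.yale.edu/id/> .      # assumed base IRI
@prefix crm:  <http://erlangen-crm.org/current/> .       # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Hypothetical scheme for the local title types
<thesaurus/title> a skos:ConceptScheme ;
  rdfs:label "YCBA Title Types" .

# The term is both a CRM entity of appropriate type and a skos:Concept
<thesaurus/title/preferred> a crm:E55_Type , skos:Concept ;
  skos:inScheme <thesaurus/title> ;
  skos:prefLabel "Preferred title" .
{noformat}

The same shape would apply to any other local thesaurus, with E55_Type swapped for the appropriate CRM class.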

h3. Meta-Thesaurus
- (-) Each thesaurus (ConceptScheme) used by Yale should be described in [Meta-Thesaurus and FR Names#YCBA Thesauri] (this section will be merged to the rest of the table)
-- This applies to both Getty and YCBA local thesauri
-- (!) TODO RS: extend RS to handle AAT, which is one ConceptScheme that includes a number of facets (hierarchies) for object type, material, technique, etc

Emmanuelle: It would be helpful to briefly go over the definitions for searchable and taggable.
- Searchable is a thesaurus that can be used in FR search. The list of FRs is [Meta-Thesaurus and FR Names#FR Names Table] and the detailed definitions are [FR Implementation]. Examples:
-- BM Object is searchable using FR2_has_type "is/has/about" because it's mapped to P2_has_type of the object
-- BM Ware and BM Currency are searchable using the same FR because they are sub-properties of P2_has_type
-- BM Aspect is *not* searchable because it's P2_has_type of E55_Type of an E25 Man-Made Feature on the object (side of coin)
-- If IPTC code is similar to "subject", it should be searchable
-- BM Unit and BM Dimension are *not* searchable because they are attributes of a Dimension of the object, and there's no FR defined for dimensions
-- BM Place is searchable, even in a hierarchical way
-- BM Place Type (town, village) and BM Place Name Type (modern, archaic) are *not* searchable because FRs don't reach into the properties of a place
- Taggable: whether the thesaurus is "interesting enough" to be used as a source of tags. Tags are general categories to be used for categorization of research questions and comments. See [Tags Spec]

h2. Term Distribution
- Yale: 99% of Yale terms come from TGN, AAT, and ULAN.
1% of terms come from ODNB, IconClass, YCBA Local terms (Frames, ...)
-- (-) Vlado: AAT surely has Frames?
- Yale: example of a lesser known person (Elihu Yale) who's found in ODNB, VIAF, DBPedia but not ULAN:
-- Vlado: such "local heroes" are a typical pattern for any museum. BM People also has "local heroes" that are not found in ULAN.
- Lec: will it be helpful if we make connections to ODNB, VIAF, DBPedia?
-- Vlado: yes, assuming you can easily export such term data according to [#Thesaurus Requirements]. If you source it from these external sources, you'd need to make the same SKOS & CRM mapping as for the rest, and register in the [#Meta-Thesaurus].
If these are indeed less than 1%, I'd source them from a single thesaurus YCBA Local.

There will be a meeting at Getty in September 2013, with 1/2 day discussion on Vocabularies

h2. Agents
- (-) here crm:E55_Type is wrong: a Group is *not* a Type
{noformat}<thesauri/nationality/British> a crm:E55_Type , crm:E74_Group , skos:Concept ;{noformat}
- (-) SKOS says one prefLabel (per language). If you don't have a flag in TMS, call the first one prefLabel and the rest altLabel
<person-institution/142> a crm:E21_Person , skos:Concept ;
skos:inScheme ycba:person-institution ;
skos:prefLabel "Robert Smirke I" , "Robert Smirke R. A." , "Robert Smirk" , "Robert I Smirke" , "Robert Smirke" ;
- (-) you don't have any date (P82_at_some_time_within) for <person-institution/142/birth/date>. This makes all the following statements useless, so kill them.
crm:P92i_was_brought_into_existence_by <person-institution/142/birth> ;
<person-institution/142/birth> a crm:E63_Beginning_of_Existence ;
crm:P4_has_time-span <person-institution/142/birth/date>
-- (-) Same for death
- (-) If you know whether it's a person or institution then use the respective specific subprop & subclass instead of the generic P92i, E63:
Person: P98i_was_born, E67_Birth
Group: P95i_was_formed_by, E66_Formation
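Putting the Agents points together, the Robert Smirke record might look roughly like the sketch below: one prefLabel with the remaining names as altLabel, and a specific E67_Birth emitted only because a date is available. The base IRI, the crm: and ycba: namespace IRIs and the birth year literal are assumptions or placeholders for illustration, not values checked against TMS:

{noformat}
@base <http://collection.britishart.yale.edu/id/> .          # assumed base IRI
@prefix crm:  <http://erlangen-crm.org/current/> .           # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ycba: <http://collection.britishart.yale.edu/id/> .  # assumed, not confirmed

<person-institution/142> a crm:E21_Person , skos:Concept ;
  skos:inScheme ycba:person-institution ;
  # first name from TMS becomes prefLabel, the rest altLabel
  skos:prefLabel "Robert Smirke I" ;
  skos:altLabel "Robert Smirke R. A." , "Robert Smirk" , "Robert I Smirke" , "Robert Smirke" ;
  # specific subproperty for a Person instead of the generic P92i
  crm:P98i_was_born <person-institution/142/birth> .

<person-institution/142/birth> a crm:E67_Birth ;
  crm:P4_has_time-span <person-institution/142/birth/date> .

# Emit the birth nodes only when this date exists
<person-institution/142/birth/date> a crm:E52_Time-Span ;
  crm:P82_at_some_time_within "1753"^^xsd:gYear .            # placeholder year
{noformat}

If no date is known, the three birth-related statements would simply be omitted.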

h2. Thesaurus URIs
- (+) Use more logical URIs that reflect the nature of the resource or type, and don't reflect their genesis in existing systems:
<thesauri/event/exhibition_history> -> <thesauri/event/exhibition> (an exhibition is NOT "exhibition history")
<event/some-exhibition/TMS/exhibition_history> -> <event/some-exhibition/identifier> (an identifier is NOT "exhibition history")
<thesauri/identifier/TMS/exhibition_history> -> <thesauri/identifier/exhibition> (doesn't matter your system is called TMS)
Lec: this may need further discussion, we may have other types of events with IDs from other systems, however made changes per suggestion
Vlado: you have a point. If you have 2 exhibition IDs then you need to add the system acronym

h2. Exhibition URIs
- (-) We need to decide on the URI for exhibitions. Originally we had a short identifier; BM suggested using the title, but this does not always work well, eg see ObjectID 34
- Vlado: Yes, pretty long titles in [].
Exhibition :: An American's Passion for British Art - Paul Mellon's Legacy, 2007-2008
Exhibition :: Great British Paintings from American Collections: Holbein to Hockney, Thursday, September 27, 2001 - Sunday, December 30, 2001
Exhibition :: J. M. W. Turner - A Selection of Paintings from the Collection of Mr. and Mrs. Paul Mellon, 1968-1969
- RS doesn't care what the URI is

h2. Getty URIs
- Don't use YCBA-specific URIs for Getty, eg
This won't let your data mesh with other data using Getty.
- (+) Use the official namespace that Getty just decided (20-Jun-2013)
Lec: ULAN, AAT, TGN converted to Getty URIs
- (-) Vlado: The URI structure is still under discussion. My suggestion is:
Getty will have a board meeting Jul 15, and may decide the URL structure then. Leave it as is for now, but you'd probably have to change it one more time after they finalize
- same for the scheme: currently is
<thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ;
skos:inScheme <thesauri/institution> .
should be:

h1. Objects

h2. Titles
- Why do you need these duplicate types?
{noformat}crm:P2_has_type <thesaurus/title/Alternate-title> , <thesaurus/title/alternate> .{noformat}
- I'm not sure what "Repository title" is. But if it means Preferred, then this is also unnecessary duplication:
{noformat} crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .{noformat}
- these two titles are duplicated. Keep just one of them: I suggest <title/1> for uniformity with the alternate title(s)
<object/19850/title/1> a crm:E35_Title ;
rdfs:label "Malvolio Dancing" ;
crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
<object/19850/title/primary> a crm:E35_Title ;
rdfs:label "Malvolio Dancing" ;
crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
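A possible deduplicated form, keeping only <title/1> and a single type. This is a sketch that assumes "Repository title" does mean Preferred; the base IRI, the crm: namespace IRI and the P102_has_title link from the object are assumptions added for completeness:

{noformat}
@base <http://collection.britishart.yale.edu/id/> .   # assumed base IRI
@prefix crm:  <http://erlangen-crm.org/current/> .    # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<object/19850> crm:P102_has_title <object/19850/title/1> .

# One title node, one type (assumes Repository title = preferred)
<object/19850/title/1> a crm:E35_Title ;
  rdfs:label "Malvolio Dancing" ;
  crm:P2_has_type <thesaurus/title/preferred> .
{noformat}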

h2. Acquisition
- Don't you know who gave up the object? Or did I just hit an object that doesn't have this data?
<object/19850/acquisition> a crm:E10_Transfer_of_Custody , crm:E8_Acquisition ;
crm:P22_transferred_title_to <thesauri/ULAN/500303557> ;
crm:P29_custody_received_by <thesauri/ULAN/500303557> ;
rdfs:label "Yale Center for British Art, Paul Mellon Collection" .
- format the label as "Transferred to ..." (now it reads as an Agent, not as a Transfer)
- you state a shorter title about the agent in ULAN. Pick shorter or longer and use it consistently:
<thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ;
skos:prefLabel "Yale Center for British Art" ;
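The acquisition remarks above could be combined roughly as follows. This is only a sketch: <person-institution/FORMER-OWNER> is a hypothetical placeholder for whoever gave up the object (it is not in the data I saw), the shorter agent name is picked arbitrarily for consistency, and the base IRI and crm: namespace IRI are assumptions:

{noformat}
@base <http://collection.britishart.yale.edu/id/> .   # assumed base IRI
@prefix crm:  <http://erlangen-crm.org/current/> .    # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<object/19850/acquisition> a crm:E10_Transfer_of_Custody , crm:E8_Acquisition ;
  # label reads as a Transfer, not as an Agent
  rdfs:label "Transferred to Yale Center for British Art" ;
  # hypothetical placeholder for the party giving up the object
  crm:P23_transferred_title_from <person-institution/FORMER-OWNER> ;
  crm:P28_custody_surrendered_by <person-institution/FORMER-OWNER> ;
  crm:P22_transferred_title_to <thesauri/ULAN/500303557> ;
  crm:P29_custody_received_by <thesauri/ULAN/500303557> .

<thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ;
  skos:prefLabel "Yale Center for British Art" .
{noformat}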

h2. Images
- This is wrong, see [image_objects_carriers@crmg]
{noformat}<object/7> P62_depicts <object/7/image/1>{noformat}
TODO Vlado: write the correct one
- this is wrong
<object/7/image/1> P108i_was_produced_by <object/7/image/1/creation>. # images are conceptual, so use P94i_was_created_by
<object/7/image/1/creation> P14_carried_out_by <object/thesauri/actor>; # by WHOM?! If you have no info, don't output Creation
rdf:type crm:E12_Production. # E12_Production is for material objects
-- Emmanuelle: I was trying to express the fact that the image is supplied/was made by YCBA, hence P108i_was_produced_by.
-- Vladimir: ok, but use the correct type and properties for Image (a conceptual object), and YCBA's URL:
<object/7/image/1> P94i_was_created_by <object/7/image/1/creation>.
<object/7/image/1/creation> P14_carried_out_by <thesauri/ULAN/500303557>;
rdf:type crm:E65_Creation.

h3. Image Kinds
RKD has the following image kinds (in separate thesauri):
- area captured: overall, detail, from left, from bottom...
- side captured: front, back
- object status: before treatment, during treatment, after treatment
- documentation type: X-ray film, action photograph, black and white detail photograph, black and white photograph, color transparency etc

(BM doesn't have such notions.)

Yale has similar notions. Our lovely Miss Prue [] has 8 images:
- cropped to image, recto, unframed
- recto, unframed
- framed, recto
- framed, verso
- detail, recto
- detail, recto
- Composite X-radiograph
- cropped to image, recto, unframed

Yale needs to represent this. For now I think it's enough to lump them in one thesaurus, eg:
PX_has_main_representation <>.
P2_has_type <yale/thes/image/kind/cropped_to_image>, <yale/thes/image/kind/recto>, <yale/thes/image/kind/unframed>.

h3. All Image Sizes
- (-) No need to repeat
P138i_has_representation <>
PX_has_main_representation <>
- (-) would be nice to record some metadata in RDF (eg MIME type, resolution)
[]: thumbnail JPG
[]: large JPG
[]: very large TIF (download, after captcha)
[]: broken link, is not an image, remove from RDF

h3. Deep Zoom images
TODO Vlado
- (-) We need them to implement Image Annotation of Yale paintings. There are many Deep Zoom images per object, eg
-- main: [,%201723-1792,%20British,%20Mrs.%20Abington%20as%20Miss%20Prue%20in%20&quot;Love%20for%20Love&quot;%20by%20William%20Congreve,%201771,%20Oil%20on%20canvas,%20Yale%20Center%20for%20British%20Art,%20Paul%20Mellon%20Collection]
-- X-Ray: [,%201723-1792,%20British,%20Mrs.%20Abington%20as%20Miss%20Prue%20in%20&quot;Love%20for%20Love&quot;%20by%20William%20Congreve,%201771,%20Oil%20on%20canvas,%20Yale%20Center%20for%20British%20Art,%20Paul%20Mellon%20Collection]
-- Note: right now I get "Error: No response from server", let's hope that's temporary

h3. Image Rights
- This says nothing (has no fields)
- this has the following problems:
<object/7/image/1> P70i_is_documented_in <object/7/image/1/terms_of_use>. # 1. should be P104_is_subject_to
<object/7/image/1/terms_of_use> # 2. it's not specific to this image, so don't use per-image node
rdfs:label ""; # 3. should be URI not string, 4. this redirects, so just use the final destination
rdf:type crm:E62_String. # 5. means nothing. So-called "CRM Primitive types" should not be used
-- 6. so simply use this:
<object/7/image/1> P104_is_subject_to <>.
<> a E30_Right; rdfs:label "Public Domain".
-- 7. better yet, use a CreativeCommons URI, since CC is a stronger authority about rights than YCBA

- Emmanuelle: I have some contextual information regarding my modeling for image rights that might help, since we are doing things a bit differently from the BM on this I believe.
-- Vladimir: indeed, BM claims rights (eg images\assets_0.trig):
crm:P138i_has_representation <>.
crm:P105_right_held_by thesIdentifier:the-british-museum.
- Emmanuelle: YCBA does not claim rights over images, just points to image use page, hence P70i_is_documented_in rather than P104_is_subject_to. YCBA does not say who owns the image rights.
-- Vladimir: that's fine: my example above (6) doesn't say YCBA claims any rights.
It just says the image P104_is_subject_to a Rights object, which allows unrestricted usage. See the scope note to be convinced this is the right class to use: "This class comprises legal *privileges* concerning material and immaterial things"

h2. Object Rights
- PX_has_copyright "Public Domain" is unnecessary since you have it structured as
If you want to output a string, use PX_display_wrap
- []
is a completely unnecessary intermediate node
- This should not be per-object
- so overall, use just this in object data:
{noformat}<object/7> P104_is_subject_to <>.{noformat}
And this in thesaurus data, not per-object:
<> a crm:E30_Right; rdfs:label "Public Domain".
- better yet, use a CreativeCommons URI, since CC is a stronger authority about rights than YCBA
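A sketch of the Creative Commons variant, for both the object and one of its images. It assumes the Public Domain Mark is the appropriate CC statement (that choice is Yale's to make, and CC0 or another URI may fit better); the base IRI and crm: namespace IRI are also assumptions:

{noformat}
@base <http://collection.britishart.yale.edu/id/> .   # assumed base IRI
@prefix crm:  <http://erlangen-crm.org/current/> .    # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Object data: point straight at the shared Right, no intermediate node
<object/7> crm:P104_is_subject_to <http://creativecommons.org/publicdomain/mark/1.0/> .
<object/7/image/1> crm:P104_is_subject_to <http://creativecommons.org/publicdomain/mark/1.0/> .

# Thesaurus data: emitted once, not per object
<http://creativecommons.org/publicdomain/mark/1.0/> a crm:E30_Right ;
  rdfs:label "Public Domain" .
{noformat}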

h2. Acquisition
- (+) P30_transferred_custody_of is wrong direction
Lec: Replaced with P30i_custody_transferred_through

h2. Concepts
- useless intermediate node
-- just use
<object/20049> P129_is_about <>,
<>. # etc
- (-) if you can't find a term during mapping, report an error, don't export it as "-1"
Lec: Emmanuelle, please have some students go through TMS, I exclude now anything that has \-1
- (-) Lec: Emmanuelle there are cases where subjects are TGN where conceptID = 0, I will try to ignore. Example ObjectID = 34

h2. Production
When and why to use {nf} <obj/production/M/association> a ycba:EX_Association {nf}?
That's defined by the specific sections in [BM Association Mapping v2]. The Intro section describes 3 patterns: code in Event, code in Subevent, code in Association, and the specific sections say which to use for which part of your data

- (-) No point using BOTH Subevents (parts) and EX_Association

h3. Produced By Specific Process
What do I do for the following crm:P14_carried_out_by association codes? Do they become types or labels?
These are all types of Production sub-events, because they pertain to the nature of the production process. See [BM Association Mapping v2#Produced By Specific Process] for the pattern:
- AR-Artist
- AU-author
- FB-finished by
- M-maker
- R-printer (printed by)
- PM-printmaker (print made by)
- Z-publisher (published by)
- TU-touched up by (print)

BTW: if you have just 1 code, it would be nice to optimize and not create sub-events, but BM doesn't do that

h3. Unknown Artist
Some records say "production performed by Unknown Artist".
- (?) Lec: removed unknown artists, this will need to be communicated with Emmanuelle, she has some reservations about it. In an effort to get our data to work with RS, I made the change.
- Vladimir's considerations:
-- RS is agnostic about this
-- CONS: If you say 10 paintings are made by Unknown Artist and use the same term, that's false because they may have been made by different people.
-- CONS: The sem web way of expressing that some info is missing is simply NOT to say anything.
-- PRO: if you want to search in RS for "paintings with Unknown creator", you can do it when mapped to an Unknown person or Unknown group. But you cannot currently search for "paintings that don't have creator info"
- Ken: Interesting conundrum about "Unknown" but it seems to be exactly as we know the world in traditional data. If one searches simply for Unknown, one does get everything Unknown, and we realize the works are not all by the same maker and that isn't a problem because we understand that Unknown is a class not an individual. Such a search however is usually combined with limiting qualifiers like "Unknown French 18th-century sculptor." Being unable to search in that fashion seems to me the much more limited option.
- Dominic: I am not sure that a person URI should refer to a generic 'unknown' which would be the same unknown person. Generally, if something is unknown it shouldn't have a triple. The absence of a triple means that it is not known. Producing triples that state a null doesn't seem useful. Is there a particular reason for specifically stating that an artist is unknown? A query for an object created in the 18th century, in the French style and that was produced by the technique of sculpting would return the result in the example. Further, if the person was unknown then you couldn't assume that s/he was French. In other words if there is a way to query to get the result required without a null then this seems preferable.
- Emmanuelle: I agree with Ken. 'Unknown creator' for an art historian carries meaningful information. It is not synonymous with a null value (of course, since all works of art have creators). Rather it means that the attribution research has not come up with conclusive information for now, and that situation can last for many years, even centuries.
- Vladimir: You can express "French" and "Sculptor". Here's how the BM thesauri are modeled:
a crm:E21_Person, skos:Concept;
skos:inScheme id:person-institution;
skos:prefLabel "Alfonso Ruspagiari";
bmo:PX_gender <>;
bmo:PX_nationality <>;
bmo:PX_profession <>.
Nationality and Profession are modeled as Groups (Gender is not, it's merely a Type):
bmo:PX_gender rdfs:subPropertyOf crm:P2_has_type .
bmo:PX_nationality rdfs:subPropertyOf crm:P107i_is_current_or_former_member_of .
bmo:PX_profession rdfs:subPropertyOf crm:P107i_is_current_or_former_member_of .
In RS the search relation (FR) "created by" is transitive over P107i_is_current_or_former_member_of.
So if you search by "French" or "sculptor" you'll find objects created by the respective nationality or profession.
If the artist is unknown, you can still say that the E74_Group "French" and the E74_Group "sculptor" P14i_performed the Production, and the same searches will work.
- Vladimir: As for "18th century": RS cannot currently search by dates of the creator (life and floruit), but it can search by creation dates, which I think is "close enough".
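To make Vladimir's suggestion concrete, an "Unknown French sculptor" production could be sketched as below, with the two Groups carrying out the Production directly and no person node at all. The object number 0000 and the <thesauri/profession/...> path are hypothetical (only <thesauri/nationality/...> appears in the current export), and the base IRI and crm: namespace IRI are assumptions:

{noformat}
@base <http://collection.britishart.yale.edu/id/> .   # assumed base IRI
@prefix crm:  <http://erlangen-crm.org/current/> .    # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Hypothetical object with no identified maker
<object/0000> crm:P108i_was_produced_by <object/0000/production> .

# No "Unknown Artist" node: the known facts attach as Groups
<object/0000/production> a crm:E12_Production ;
  crm:P14_carried_out_by <thesauri/nationality/French> ,
                         <thesauri/profession/sculptor> .

# Groups are E74_Group plus skos:Concept, not E55_Type
<thesauri/nationality/French> a crm:E74_Group , skos:Concept ;
  skos:prefLabel "French" .

<thesauri/profession/sculptor> a crm:E74_Group , skos:Concept ;
  skos:prefLabel "sculptor" .
{noformat}

This keeps the "created by" searches for "French" or "sculptor" working, but it does not by itself support a search for "Unknown", which is the point Ken and Emmanuelle raise.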

h3. Closely Related Group
- Emmanuelle: I think it would make sense to model 'unknown artist' as a group because it is a denomination that actually contains many different entities (all unknown artists are not the same), and also because traditionally when we speak we create a short cut but we actually cannot be sure that 'unknown artist' is only one artist for a work of art.
-- The National Gallery has grappled with this as well and models 'Unknown Artist' as a sub-group of E74:
The production of a particular painting (E84) was carried out by an Unknown Artist (E21) who was known to be part of the EN41.Artist_Sub_Group of a known/individually defined Master (E21)
-- I have represented this logic in the attached graph. On the model of the NG I have also moved Circle of, Studio of and Workshop of to be Production association codes for E74_Group.
[^YCBA Production2 unknown creator.pdf]
{viewpdf:YCBA Production2 unknown creator.pdf}
- Vlado: this closely follows [BM Association Mapping v2#Production by Closely Related Group], and it's more faithful modeling than before. But it has implications for Search that need to be discussed with BM.
(?) Need to track the discussion at [BM Association Mapping Problems#Closely Related Group].
- Vlado: not all these codes are the same, and they map to *different CRM constructs*. I would group the NG codes as follows:
-- EN20-Studio of, EN23-Circle of, EN24-Workshop of, EN25-School of: [BM Association Mapping v2#Production by Closely Related Group]
-- EN44-Associate of, EN21-Follower of, EN22-Imitator of, EN42-After: [BM Association Mapping v2#Influenced By]
-- EN43-Attributed to: [BM Association Mapping v2#Probably/Unlikely Produced By]
- Some of these are subject to debate. Eg BM mapped two codes that sound similar to me onto different patterns:
AJ: Circle/School of: [BM Association Mapping v2#Production by Closely Related Group]
S: School of/style of [BM Association Mapping v2#Influenced By]

h2. Curatorial Comment
- (-) Yale: PX_curatorial_comment needs date and author added to the data model
- Vlado: this is clearly a case of EX_Association. It is a subclass of E13_Attribute_Assignment, so see [attribute_assignment@crmg] and [recorder@crmg]:
<obj> bmo:PX_curatorial_comment "comment".
<obj/comment/1> a bmo:EX_Association;
P140_assigned_attribute_to <obj>; P141_assigned "comment"; bmo:PX_property bmo:PX_curatorial_comment;
P14_carried_out_by <researcher>;
P4_has_time-span <obj/comment/1/date>.
<obj/comment/1/date> P82_at_some_time_within "2013-07-06"^^xsd:date.
-- you could alternatively use P3_has_note instead of PX_curatorial_comment, to accommodate apps that know CRM and EX_Association but not PX_curatorial_comment. But I still think using the subproperty is better

h2. Inscriptions
I haven't seen any Inscription info in TTL. And in LIDO I see only empty XML tags for the following types, without data:
<lido:inscriptions lido:type="Inscription">
<lido:inscriptions lido:type="Marks">
<lido:inscriptions lido:type="Lettering">
<lido:inscriptions lido:type="Signed and Dated">
Are there any objects with inscriptions for me to check?