
Notes from May-Jul 2013
RS-1874
- Intro
- Problems
- Thesauri
- Objects
Intro
Lec: Our mapping is the same as Dominic's manual (to the best of our understanding)
Vlado: Does it comply with BM's latest changes to modeling Association codes (esp re Acquisition, Production)? BM Association Mapping v2. Dominic's document probably reflects this, but these are recent changes and I haven't checked.
YCBA uses the following systems:
- Bedework = calendaring
- TMS = art collections
- Drupal = website, exhibitions etc
- Orbis = books etc
Getting Yale On-board
Lec: If there is something else you need to get this to work with Research Space please let me know
Vlado: Once it's compliant, we should:
- try loading the data
- load your thesauri. Complete Getty or only the subset used by your objects? How about Broader terms?
- implement Image Annotation over your DeepZoom images (we also use IIP Image for RKD images, so shouldn't be too hard)
- coreference some of your terms to enable cross-collection search
- create RForms for your objects
See RS Plan 3.7#Get Yale on-board with ResearchSpace for details. As of 08-Jul-2013, this iteration is under planning and the exact scope and start are not clear. I think we can get Yale on board before mid-Sep (the Getty meeting), but it depends on the exact scope.
It appears that Yale is not the bottleneck: starting the RS3.7 iteration is. So please let's not rush this review process!
Legend
Please don't use color or strikethrough, use the following symbols for easier tracking (easiest to edit them in Wiki Markup mode):
open issue
resolved issue
issue under discussion
Eyeball
Lec: I am reviewing for any typos, missing types...
Have you tried Eyeball? See here: RDF Validation and Conversion#Eyeball
RS-1071
Lec: We tried Eyeball without luck: we were not able to install it after a number of tries, so we have to contact the dev community. TBD.
Problems
General
GREY Pubby prefixes are not setup: shows "?:..."
Lec: low priority, at first glance did not see where the config has this, will need to return to it after we are done with everything or in spare time.
STRONGLY suggest to have 1 URI per object, not 3 sameAs URIs
GREY Lec: BM has multiple, followed their lead - we still need to figure this out (don't want to publish two data sets one for RS and one for the world) - not pressing low priority
Vlado: RS currently cannot work with these sameAs (eg would return results in triplicate). BM puts sameAs in separate files that we don't load
Don't emit prefixes you don't need: lccn, oclc, ycba_aat, etc
crm:PX_* (e.g. crm:PX_display_wrap) is wrong, should be bmo:PX_*
Lec: fixed with http://collection.britishart.yale.edu/id/ontology/PX_display_wrap
Vlado: Please use bmo:PX_* and not ycba:PX_*: don't create a second property with the same purpose.
Same holds about classes: use bmo:EX_Association not ycba:EX_Association: don't define your own class for the same purpose.
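A minimal Turtle sketch of this point, reusing the BM extension terms rather than minting ycba: equivalents. The bmo: namespace and the association node path are assumptions for illustration only; check them against the BM ontology that RS actually loads.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
# Assumed namespace of the BM ontology (bmo:)
@prefix bmo: <http://collection.britishmuseum.org/id/ontology/> .

# Reuse the existing BM property; do not define ycba:PX_display_wrap
<object/34> bmo:PX_display_wrap "Exhibition :: J. M. W. Turner - A Selection of Paintings from the Collection of Mr. and Mrs. Paul Mellon, 1968-1969" .

# Reuse the existing BM class; the node URI below is hypothetical
<object/34/production/association/1> a bmo:EX_Association .
```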
Connected Resources
TODO Vlado: write down appropriate representations
It's best to list all URLs closely related to each object, eg
http://collection.britishart.yale.edu/id/object/5005
"Mrs. Abington as Miss Prue in Love for Love by William Congreve":
- VUFind record (i.e. "home page")
http://collections.britishart.yale.edu/vufind/Record/1669236
Note: the last part the CCD id http://collection.britishart.yale.edu/id/object/5005/ccd
- all Images, including Deep Zoom images
- LIDO record
http://collections.britishart.yale.edu/oaicatmuseum/OAIHandler?verb=GetRecord&identifier=oai:tms.ycba.yale.edu:7&metadataPrefix=lido
Note: this link currently doesn't work
- ODAI (Yale Digital Collections Center) aggregation:
http://discover.odai.yale.edu/ydc/Record/1669236
- Google Cultural Institute (Google Art Project)
Short: http://www.google.com/culturalinstitute/asset-viewer/tQHBb0Q2MZF2uQ
Long: http://www.google.com/culturalinstitute/asset-viewer/mrs-abington-as-miss-prue-in-love-for-love-by-william-congreve/tQHBb0Q2MZF2uQ
- Pubby page. I don't think we need it since
http://collection.britishart.yale.edu/id/page/object/5005
Thesauri
Whole Getty or Parts
GREY Currently you emit thesaurus data together with the object, and only the terms used in the object:
- This way you miss Broader terms, so eg a search for "Animal" or "Mammal" won't find FR Transitivity
- Lec: regarding Places, why don't we use Geographic Coordinates (included by Yale) and search by a bounding box?
- Vlado: RS currently doesn't have bounding box search, because BM Places thesaurus doesn't include Geographic Coordinates.
RS has place name search, that uses the place hierarchy.
- This way you repeat data about the same term in many objects
- If you start emitting objects in separate graphs like BM does (to be able to easily replace/delete), the term data will be duplicated in each of these object graphs
As you can see already happens about YCBA itself:
<http://vocab.getty.edu/resource/ulan/subject/500303557>
  a crm:E74_Group , skos:Concept ;
  skos:prefLabel "Yale Center for British Art" ;
  skos:inScheme <http://collection.britishart.yale.edu/id/thesauri/institution> ;
  a crm:E74_Group , skos:Concept ;
  skos:prefLabel "Yale Center for British Art" ;
  skos:inScheme <http://collection.britishart.yale.edu/id/thesauri/institution> .
- In several cases these in-object ad-hoc terms don't satisfy Thesaurus Requirements (see next section)
My strong recommendation is to export the complete Getty thesauri
- we shouldn't wait for Getty to do an official mapping, since it'll take a few months for TGN and ULAN, and it won't satisfy the requirement to publish as CRM (see next section)
- I can do this mapping. I'll also be involved in the Getty's mapping, so that's a good synergy
- Getty's committed to publish as LOD, so hopefully they won't object, as soon as we mark our export as Unofficial
- use a separate thesaurus export config, like BM does
Lec: I export terms within objects because they are present in the LIDO XML.
But to export the complete thesauri, I would need to get someone to export from the RDBMS, and we cannot do this right now
Decision:
- For the time being we'll stay without Broader terms
- Dominic and Vlado to try to expedite the Getty export by Getty
Thesaurus Requirements
You must comply with BMX Issues#Thesaurus requirements
Each term should be both a CRM entity of appropriate type, and skos:Concept.
You have this for some (eg Agents) but not others (eg Title Type).
Meta-Thesaurus
Each thesaurus (ConceptScheme) used by Yale should be described in Meta-Thesaurus and FR Names#YCBA Thesauri (this section will be merged to the rest of the table)
- This applies to both Getty and YCBA local thesauri.
- GREY extend RS to handle AAT as one ConceptScheme that includes a number of facets (hierarchies) for object type, material, technique, etc
- For now: Yale will export AAT terms in different concept schemes, eg:
aat:123456 skos:inScheme aat_object: . # painting
aat:678901 skos:inScheme aat_material: . # canvas
aat:234567 skos:inScheme aat_technique: . # oil
aat:890123 skos:inScheme aat_subject: . # horse
- TODO Vlado: use actual AAT identifiers above, and use the AAT Facets (top-levels) as concept schemes
Emmanuelle: It would be helpful to briefly go over the definitions for searchable and tagable.
- Searchable is a thesaurus that can be used in FR search. The list of FRs is Meta-Thesaurus and FR Names#FR Names Table and the detailed definitions are FR Implementation. Examples:
- BM Object is searchable using FR2_has_type "is/has/about" because it's mapped to P2_has_type of the object
- BM Ware and BM Currency are searchable using the same FR because they are sub-properties of P2_has_type
- BM Aspect is not searchable because it's P2_has_type of E55_Type of a E25 Man-Made Feature on the object (side of coin)
- If IPTC code is similar to "subject", it should be searchable
- BM Unit and BM Dimension are not searchable because they are attributes of a Dimension of the object, and there's no FR defined for dimensions
- BM Place is searchable, even in a hierarchical way
- BM Place Type (town, village) and BM Place Name Type (modern, archaic) are not searchable because FRs don't reach into the properties of a place
- Taggable: whether the thesaurus is "interesting enough" to be used as a source of tags. Tags are general categories to be used for categorization of research questions and comments. See Tags Spec
Term Distribution
- Yale: 99% of Yale terms come from TGN, AAT, and ULAN.
1% of terms come from ODNB, IconClass, YCBA Local terms (Frames, ...)
Vlado: AAT surely has Frames?
- Yale: example of a lesser known person (Elihu Yale) who's found in ODNB, VIAF, DBPedia but not ULAN:
http://www.oxforddnb.com/view/article/30183
http://viaf.org/viaf/46310522/
http://dbpedia.org/page/Elihu_Yale
- Vlado: such "local heroes" are a typical pattern for any museum. BM People also has "local heroes" that are not found in ULAN.
- Lec: will it be helpful if we make connections to ODNB, VIAF, DBPedia?
- Vlado: yes, assuming you can easily export such term data according to Thesaurus Requirements. If you source it from these external sources, you'd need to make the same SKOS & CRM mapping as for the rest, and register in the Meta-Thesaurus.
If these are indeed less than 1%, I'd source them from a single thesaurus YCBA Local.
There will be a meeting at Getty in September 2013, with 1/2 day discussion on Vocabularies
Agents
remove crm:E55_Type: a Group is not a Type
<thesauri/nationality/British> a crm:E55_Type , crm:E74_Group , skos:Concept ;
SKOS says one prefLabel (per language). If you don't have a flag in TMS, call the first one prefLabel and the rest altLabel
<person-institution/142> a crm:E21_Person , skos:Concept ;
  skos:inScheme ycba:person-institution ;
  skos:prefLabel "Robert Smirke I" , "Robert Smirke R. A." , "Robert Smirk" , "Robert I Smirke" , "Robert Smirke" ;
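A sketch of how the same record could look once the first name is treated as preferred and the rest as alternates, as suggested above. The crm: and ycba: namespace bindings are assumptions (the notes never spell them out), and picking "Robert Smirke I" as preferred simply follows the "first one wins" rule, not any TMS flag.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .              # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ycba: <http://collection.britishart.yale.edu/id/> .     # assumed binding for ycba:

<person-institution/142> a crm:E21_Person , skos:Concept ;
    skos:inScheme ycba:person-institution ;
    # exactly one preferred label (per language)
    skos:prefLabel "Robert Smirke I" ;
    # every other name variant becomes an alternate label
    skos:altLabel "Robert Smirke R. A." , "Robert Smirk" , "Robert I Smirke" , "Robert Smirke" .
```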
Lec: Awaiting Emmanuelle confirmation if subjectActor will have multiple names, currently not in LIDO
Emmanuelle: yes a fair number of our subjectActor have alternate names in addition to their preferred names.
You don't have any date (P82_at_some_time_within) for <person-institution/142/birth/date>. This makes all the following statements useless, so kill them.
Emmanuelle: Some institutions have documented dates of existence: Published by Advanced Graphics London, 1969-present in http://collection.britishart.yale.edu/id/page/object/48770
Emmanuelle: example of creator that is an institution: Monro School (but no documented dates of existence in our system): http://collection.britishart.yale.edu/id/page/object/5981
Example of institution as Rights Administrator: Design and Artists Copyright Society in http://collection.britishart.yale.edu/id/page/object/5054
<person-institution/142> crm:P92i_was_brought_into_existence_by <person-institution/142/birth> .
<person-institution/142/birth> a crm:E63_Beginning_of_Existence ;
  crm:P4_has_time-span <person-institution/142/birth/date> .
Same for death
If you know whether it's a person or institution then use the respective specific subprop & subclass instead of the generic P92i, E63:
Person: P98i_was_born, E67_Birth
Group: P95i_was_formed_by, E66_Formation
Lec: we have more variety in Person dates.
Emmanuelle: to provide some more examples
- Richard Wilson, 1712 or 1713-1782 in http://collection.britishart.yale.edu/id/object/423
- Damien Hirst, born 1965 in http://collection.britishart.yale.edu/id/object/4908
- John Samuel Agar, ca. 1770-after 1820 in http://collection.britishart.yale.edu/id/page/object/26383
- Print made by John Bruce, fl. 1826 in http://collection.britishart.yale.edu/id/page/object/30564
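One possible way to carry uncertain or open-ended life dates like those above, sketched for "1712 or 1713-1782" and "born 1965". It assumes the Erlangen CRM namespace and its P82a/P82b extension properties for interval bounds; the person URIs are hypothetical placeholders, not real YCBA identifiers.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical URI for Richard Wilson, 1712 or 1713-1782
<person-institution/wilson> a crm:E21_Person ;
    crm:P98i_was_born <person-institution/wilson/birth> ;
    crm:P100i_died_in <person-institution/wilson/death> .

<person-institution/wilson/birth> a crm:E67_Birth ;
    crm:P4_has_time-span <person-institution/wilson/birth/date> .
# "1712 or 1713": the interval covers both candidate years
<person-institution/wilson/birth/date> a crm:E52_Time-Span ;
    rdfs:label "1712 or 1713" ;
    crm:P82a_begin_of_the_begin "1712-01-01"^^xsd:date ;
    crm:P82b_end_of_the_end "1713-12-31"^^xsd:date .

<person-institution/wilson/death> a crm:E69_Death ;
    crm:P4_has_time-span <person-institution/wilson/death/date> .
<person-institution/wilson/death/date> a crm:E52_Time-Span ;
    rdfs:label "1782" ;
    crm:P82a_begin_of_the_begin "1782-01-01"^^xsd:date ;
    crm:P82b_end_of_the_end "1782-12-31"^^xsd:date .

# Hypothetical URI for Damien Hirst, born 1965: no death node is emitted at all
<person-institution/hirst> a crm:E21_Person ;
    crm:P98i_was_born <person-institution/hirst/birth> .
<person-institution/hirst/birth> a crm:E67_Birth ;
    crm:P4_has_time-span <person-institution/hirst/birth/date> .
<person-institution/hirst/birth/date> a crm:E52_Time-Span ;
    rdfs:label "1965" ;
    crm:P82a_begin_of_the_begin "1965-01-01"^^xsd:date ;
    crm:P82b_end_of_the_end "1965-12-31"^^xsd:date .
```

Cases such as "ca. 1770-after 1820" or "fl. 1826" need a decision on how wide the interval should be, so they are left out of this sketch.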
Thesaurus URIs
Use more logical URIs that reflect the nature of the resource or type, and don't reflect their genesis in existing systems:
<thesauri/event/exhibition_history> -> <thesauri/event/exhibition> (an exhibition is NOT "exhibition history")
<event/some-exhibition/TMS/exhibition_history> -> <event/some-exhibition/identifier> (an identifier is NOT "exhibition history")
<thesauri/identifier/TMS/exhibition_history> -> <thesauri/identifier/exhibition> (doesn't matter your system is called TMS)
Lec: this may need further discussion, we may have other types of events with IDs from other systems, however made changes per suggestion
Vlado: you have a point. If you have 2 exhibition IDs then you need to add the system acronym
Exhibition URIs
We need to decide on the URI for exhibitions. Originally we had a short identifier; BM suggested using the title, but this does not always work well, eg see ObjectID 34
- Vlado: Yes, pretty long titles in http://collection.britishart.yale.edu/id/page/object/34.
Exhibition :: An American's Passion for British Art - Paul Mellon's Legacy, 2007-2008
Exhibition :: Great British Paintings from American Collections: Holbein to Hockney, Thursday, September 27, 2001 - Sunday, December 30, 2001
Exhibition :: J. M. W. Turner - A Selection of Paintings from the Collection of Mr. and Mrs. Paul Mellon, 1968-1969
- RS doesn't care what the URI is
Getty URIs
- Don't use YCBA-specific URIs for Getty, eg
http://collection.britishart.yale.edu/id/thesauri/ULAN/500303557
This won't let your data mesh with other data using Getty. Use the official namespace that Getty just decided (20-Jun-2013)
http://vocab.getty.edu
Lec: ULAN, AAT, TGN converted to Getty URIs
Vlado: The URI structure is still under discussion. My suggestion is:
http://vocab.getty.edu/aat/12345678
http://vocab.getty.edu/tgn/12345678
http://vocab.getty.edu/ulan/500303557
Getty will have a board meeting Jul 15, and may decide the URL structure then. Leave it as is for now, but you'd probably have to change it one more time after they finalize
- same for the scheme: currently is
<thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ; skos:inScheme <thesauri/institution> .
should be:
http://vocab.getty.edu/aat/
http://vocab.getty.edu/tgn/
http://vocab.getty.edu/ulan/
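Put together, the YCBA term might then look like the sketch below. It assumes the URI structure suggested above (which Getty had not yet finalized) and an Erlangen CRM namespace for crm:.

```turtle
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Term and scheme both live in the Getty namespace, not under a YCBA-specific URI
<http://vocab.getty.edu/ulan/500303557> a crm:E74_Group , skos:Concept ;
    skos:prefLabel "Yale Center for British Art" ;
    skos:inScheme <http://vocab.getty.edu/ulan/> .
```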
Objects
Titles
Why do you need these duplicate types?
crm:P2_has_type <thesaurus/title/Alternate-title> , <thesaurus/title/alternate> .
I'm not sure what "Repository title" is. But if it means Preferred, then this is also unnecessary duplication:
crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
Emmanuelle to clarify.
EDG: the capitalized title types (Repository, Alternate, Verso,...) talk about the purpose of the titles, not their ranking. The 2 other title attributes (alternate and preferred) talk about the ranking/preference of the titles.
Can an object have different Repository title and Preferred title?
EDG: no, all Repository titles are always the preferred ones. But all alternate titles are not of the type Alternate. Here are all the types possible: Alternate title, Collective title, Creator's title, Exhibited title, Foreign language title, Former title, Inscribed title, Repository, Verso title.
These two titles are duplicated. Keep just one of them: I suggest <title/1> for uniformity with the alternate title(s)
<object/19850/title/1> a crm:E35_Title ;
  rdfs:label "Malvolio Dancing" ;
  crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
<object/19850/title/primary> a crm:E35_Title ;
  rdfs:label "Malvolio Dancing" ;
  crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
(optional) Indicate the title language:
<object/19850/title/1> a crm:E35_Title ;
  rdfs:label "Malvolio Dancing"@en ;
  crm:P72_has_language <thesaurus/language/english> .
Emmanuelle: we indicate the language of the titles only if they are in foreign language <thesaurus/title/ForeignLanguage-title>, and probably not consistently. All the other titles are understood as being in American English, the official language of the YCBA.
Acquisition
The YCBA director does not want to publish which person gave up the object.
Maybe if the person has passed, that can be exported. Emmanuelle to clarify.
Emmanuelle's answer: Right now, we absolutely cannot publish who were the previous owners of our objects, no matter if they have passed or not.
<object/19850/acquisition> a crm:E10_Transfer_of_Custody , crm:E8_Acquisition ;
  crm:P22_transferred_title_to <thesauri/ULAN/500303557> ;
  crm:P29_custody_received_by <thesauri/ULAN/500303557> ;
  rdfs:label "Yale Center for British Art, Paul Mellon Collection" .
format the label as "Transferred to ..." (now it reads as an Agent, not as a Transfer)
If YCBA is incorporated, use E40_Legal_Body instead of the more generic E74_Group:
<thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ; skos:prefLabel "Yale Center for British Art" .
The acquisition label (and the Credit Line facet here) show that there are several sub-agents (or sub-collections) under it: Paul Mellon Collection, Paul Mellon Fund, Gift of Mr. and Mrs. J. Richardson Dilworth, B.A. 1938.
If it's important to preserve this information in RDF, you could create sub-agents under YCBA, eg like this:
<person-institution/ycba_mellon_collection> a E74_Group, skos:Concept;
  skos:inScheme <person-institution/>;
  rdfs:label "Yale Center for British Art, Paul Mellon Collection" ;
  skos:broader ulan:500303557;
  P107i_is_current_or_former_member_of ulan:500303557 .
<person-institution/ycba_dilworth_gift> a E74_Group, skos:Concept;
  skos:inScheme <person-institution/>;
  rdfs:label "Yale Center for British Art, Gift of Mr. and Mrs. J. Richardson Dilworth, B.A. 1938" ;
  skos:broader ulan:500303557;
  P107i_is_current_or_former_member_of ulan:500303557 .
Images
this was wrong, see image_objects_carriers@crmg
<object/7> P62_depicts <object/7/image/1>
The correct one is:
<object/7> P65_shows_visual_item <object/7/image/1>
Explanation: an E24_Physical_Man-Made_Thing P128_carries an E73_Information_Object. In the particular case of E38_Image, we say E24_Physical_Man-Made_Thing P65_shows_visual_item E38_Image
this is wrong
<object/7/image/1> P108i_was_produced_by <object/7/image/1/creation>. # images are conceptual, so use P94i_was_created_by
<object/7/image/1/creation> P14_carried_out_by <object/thesauri/actor>; # by WHOM?! If you have no info, don't output Creation
  rdf:type crm:E12_Production. # E12_Production is for material objects
- Emmanuelle: I was trying to express the fact that the image is supplied/was made by YCBA, hence P108i_was_produced_by.
- Vladimir: ok, but use the correct type and properties for Image (a conceptual object), and YCBA's URL:
<object/7/image/1> P94i_was_created_by <object/7/image/1/creation>.
<object/7/image/1/creation> P14_carried_out_by <thesauri/ULAN/500303557>;
  rdf:type crm:E65_Creation.
Emmanuelle to Vladimir: I had used P94i in my original modeling but then changed it because we are talking about a physical/digital image. So ok, good to go with P94i.
Image Views
RKD has the following image views (in separate thesauri):
- area captured: overall, detail, from left, from bottom...
- side captured: front, back
- object status: before treatment, during treatment, after treatment
- documentation type: X-ray film, action photograph, black and white detail photograph, black and white photograph, color transparency etc
(BM doesn't have such notions.)
Yale has similar notions. Our lovely Miss Prue http://collections.britishart.yale.edu/vufind/Record/1669236 has 8 images:
- cropped to image, recto, unframed
- recto, unframed
- framed, recto
- framed, verso
- detail, recto
- detail, recto
- Composite X-radiograph
- cropped to image, recto, unframed
Yale needs to represent this. For now I think it's enough to lump them in one thesaurus, eg:
<http://collection.britishart.yale.edu/id/object/5005> PX_has_main_representation <http://deliver.odai.yale.edu/content/repository/YCBA/object/5005/type/2/format/1>.
<http://deliver.odai.yale.edu/content/repository/YCBA/object/5005/type/2/format/1> P2_has_type
  <yale/thes/image/kind/cropped_to_image>,
  <yale/thes/image/kind/recto>,
  <yale/thes/image/kind/unframed>.
All Image Sizes
format/1: thumbnail (for result list)
format/2: large (for lightbox)
format/3: full screen (1920)
format/6: for publication (3000)
format/7: deep zoom JPEG2000 (eg 4492)
No need to repeat
P138i_has_representation <http://deliver.odai.yale.edu/content/repository/YCBA/object/5005/type/2/format/1>
PX_has_main_representation <http://deliver.odai.yale.edu/content/repository/YCBA/object/5005/type/2/format/1>
would be nice to record some metadata in RDF (eg MIME type, resolution)
http://deliver.odai.yale.edu/content/repository/YCBA/object/7/type/2/format/1: thumbnail JPG
http://deliver.odai.yale.edu/content/repository/YCBA/object/7/type/2/format/3: large JPG
http://deliver.odai.yale.edu/content/repository/YCBA/object/7/type/2/format/6: very large TIF (download, after captcha)
http://deliver.odai.yale.edu/info/repository/YCBA/object/7/type/2-output-jsonp-callback: broken link, is not an image, remove from RDF
Lec: the right link is this, and has lots of metadata about the images
http://deliver.odai.yale.edu/info/repository/YCBA/object/7/type/2?output=jsonp&callback
Use http://jsonviewer.stack.hu/ to format the metadata
Deep Zoom images
TODO Vlado
We need them to implement Image Annotation of Yale paintings. There are many Deep Zoom images per object, eg
- main: http://collection.britishart.yale.edu/danprac/iip.html?image=1b747e8f-7754-482c-b5a2-9e1dc1986f4b&credit=Sir%20Joshua%20Reynolds,%201723-1792,%20British,%20Mrs.%20Abington%20as%20Miss%20Prue%20in%20"Love%20for%20Love"%20by%20William%20Congreve,%201771,%20Oil%20on%20canvas,%20Yale%20Center%20for%20British%20Art,%20Paul%20Mellon%20Collection
- X-Ray: http://collection.britishart.yale.edu/danprac/iip.html?image=8621278e-1ebc-4548-8834-0124ee800c41&credit=Sir%20Joshua%20Reynolds,%201723-1792,%20British,%20Mrs.%20Abington%20as%20Miss%20Prue%20in%20"Love%20for%20Love"%20by%20William%20Congreve,%201771,%20Oil%20on%20canvas,%20Yale%20Center%20for%20British%20Art,%20Paul%20Mellon%20Collection
- Note: right now I get "Error: No response from server http://scale.ydc2.yale.edu/iiif", let's hope that's temporary
Image Rights
This says nothing (has no fields)
<object/7/image/1> P104_is_subject_to http://collection.britishart.yale.edu/id/page/object/7/image/1/restriction
this has the following problems:
<object/7/image/1> P70i_is_documented_in <object/7/image/1/terms_of_use>. # 1. should be P104_is_subject_to
<object/7/image/1/terms_of_use> # 2. it's not specific to this image, so don't use per-image node
  rdfs:label "http://hdl.handle.net/10079/w6m90dq"; # 3. should be URI not string, 4. this redirects, so just use the final destination
  rdf:type crm:E62_String. # 5. means nothing. So-called "CRM Primitive types" should not be used
- so simply use this:
<object/7/image/1> P104_is_subject_to <http://britishart.yale.edu/terms/imaging/unrestricted>. # 6
<object/5054/image/1> P104_is_subject_to <http://britishart.yale.edu/terms/imaging/under_copyright>. # 7
<http://britishart.yale.edu/terms/imaging/unrestricted> a E30_Right; rdfs:label "Public Domain".
<http://britishart.yale.edu/terms/imaging/under_copyright> a E30_Right; rdfs:label "Under Copyright: © Estate of the Artist".
- 8. better yet, use CreativeCommons URIs, since CC is a stronger authority about rights than YCBA.
But this is optional: if YCBA has defined 15 different rights, it won't be easy to match them to CC URIs.
Discussion:
- Emmanuelle: I have some contextual information regarding my modeling for image rights that might help, since we are doing things a bit differently from the BM on this I believe.
- Vladimir: indeed, BM claims rights (eg images\assets_0.trig):
<http://collection.britishmuseum.org/id/object/MCT3411> crm:P138i_has_representation <http://www.britishmuseum.org/collectionimages/AN00589/AN00589075_001_l.jpg>.
<http://www.britishmuseum.org/collectionimages/AN00589/AN00589070_001_l.jpg> crm:P105_right_held_by thesIdentifier:the-british-museum.
- Emmanuelle: YCBA does not claim rights over images, just points to image use page, hence P70i_is_documented_in rather than P104_is_subject_to. YCBA does not say who owns the image rights.
- Vladimir: My example above (6) says the image P104_is_subject_to a Rights object, which allows unrestricted usage. See the scope note to be convinced this is the right class to use: "This class comprises legal privileges concerning material and immaterial things". It doesn't say YCBA holds any rights.
- For (7) it would be nice to compute the actual holder of rights and state P105_right_held_by but that's not easy
- Emmanuelle: OK, I see that P104_is_subject_to is good even when no image restrictions apply. Then let's use P104_is_subject_to
Object Rights
- PX_has_copyright "Public Domain" is unnecessary since you have it structured as
http://collection.britishart.yale.edu/id/page/object/7/object-rights.
If you want to output a string, use PX_display_wrap
- http://collection.britishart.yale.edu/id/page/object/7/object-rights
is a completely unnecessary intermediate node
- This should not be per-object
http://collection.britishart.yale.edu/id/object/7/public-domain
- so overall, use just this in object data:
<object/7> P104_is_subject_to <http://britishart.yale.edu/terms/public-domain>.
And this in thesaurus data, not per-object:
<http://britishart.yale.edu/terms/public-domain> a crm:E30_Right; rdfs:label "Public Domain".
- better yet, use a CreativeCommons URI, since CC is a stronger authority about rights than YCBA
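As a sketch of the Creative Commons variant, the object could point straight at the CC Public Domain Mark. Whether the Public Domain Mark (rather than, say, CC0) is the right match for YCBA's "Public Domain" statement is an assumption that YCBA would need to confirm; the crm: namespace is also assumed.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# In object data: one triple, no intermediate per-object node
<object/7> crm:P104_is_subject_to <http://creativecommons.org/publicdomain/mark/1.0/> .

# In thesaurus data, stated once for all objects
<http://creativecommons.org/publicdomain/mark/1.0/> a crm:E30_Right ;
    rdfs:label "Public Domain" .
```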
Acquisition
P30_transferred_custody_of is wrong direction
Lec: Replaced with P30i_custody_transferred_through
Concepts
- useless intermediate node
http://collection.britishart.yale.edu/id/page/object/20049/concept/1
- just use
<object/20049> P129_is_about <http://collection.britishart.yale.edu/id/thesauri/AAT/124118>, <http://collection.britishart.yale.edu/id/thesauri/AAT/120779>. # etc
if you can't find a term during mapping, report an error, don't export it as "-1"
http://collection.britishart.yale.edu/id/thesauri/AAT/-1
Lec: Emmanuelle, please have some students go through TMS, I exclude now anything that has -1
Lec: Emmanuelle there are cases where subjects are TGN where conceptID = 0, I will try to ignore. Example ObjectID = 34
Production
When and why to use <obj/production/M/association> a ycba:EX_Association
That's defined by the specific sections in BM Association Mapping v2. The Intro section describes 3 patterns: code in Event, code in Subevent, code in Association, and the specific sections say which to use for which part of your data
No point using BOTH Subevents (parts) and EX_Association
Produced By Specific Process
What do I do for the following crm:P14_carried_out_by association codes? Do they become types or labels?
These are all types of Production sub-events, because they pertain to the nature of the production process. See BM Association Mapping v2#Produced By Specific Process for the pattern:
- AR-Artist
- AU-author
- FB-finished by
- M-maker
- R-printer (printed by)
- PM-printmaker (print made by)
- Z-publisher (published by)
- TU-touched up by (print)
BTW: if you have just 1 code, it would be nice to optimize and not create sub-events, but BM doesn't do that
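A rough sketch of the sub-event pattern for one of these codes (PM, "print made by"), using the John Bruce print mentioned earlier in these notes. The term URI for the code, the person URI and the crm: namespace are all hypothetical; BM Association Mapping v2 remains the authority for the exact shape.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm: <http://erlangen-crm.org/current/> .   # assumed CRM namespace

<object/30564> crm:P108i_was_produced_by <object/30564/production> .

# The overall production consists of typed sub-events, one per association code
<object/30564/production> a crm:E12_Production ;
    crm:P9_consists_of <object/30564/production/1> .

<object/30564/production/1> a crm:E12_Production ;
    crm:P2_has_type <thesauri/production/PM> ;           # hypothetical term URI for "print made by"
    crm:P14_carried_out_by <person-institution/0000> .   # hypothetical URI for John Bruce
```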
Unknown Artist
Some records say "production performed by Unknown Artist".
Lec: removed unknown artists. This will need to be communicated to Emmanuelle, who has some reservations about it. I made the change in an effort to get our data to work with RS.
- Vladimir's considerations:
- RS is agnostic about this
- CONS: If you say 10 paintings are made by Unknown Artist and use the same term, that's false because they may have been made by different people.
- CONS: The sem web way of expressing that some info is missing is simply NOT to say anything.
- PRO: if you want to search in RS for "paintings with Unknown creator", you can do it when mapped to an Unknown person or Unknown group. But you cannot currently search for "paintings that don't have creator info"
- Ken: Interesting conundrum about "Unknown" but it seems to be exactly as we know the world in traditional data. If one searches simply for Unknown, one does get everything Unknown, and we realize the works are not all by the same maker and that isn't a problem because we understand that Unknown is a class not an individual.
- Dominic: I am not sure that a person URI should refer to a generic 'unknown' which would be the same unknown person. Generally, if something is unknown it shouldn't have a triple. The absence of a triple means that it is not known. Producing triples that state a null doesn't seem useful. Is there a particular reason for specifically stating that an artist is unknown? A query for an object created in the 18th century, in the french style and that was produced by the technique of sculpting would return the result in the example. Further, if the person was unknown then you couldn't assume that s/he was French. In other words if there is a way to query to get the result required without a null then this seems preferable.
- Emmanuelle: I agree with Ken. 'Unknown creator' for an art historian carries meaningful information. It is not synonymous with a null value (of course since all works of art have creators). Rather it means that the attribution research has not come up with conclusive information for now, and that situation can last for many years even centuries.
- Ken: Such a search however is usually combined with limiting qualifiers like "Unknown French 18th-century sculptor." Being unable to search in that fashion seems to me the much more limited option.
- Vladimir: You can express "French" and "Sculptor". Here's how the BM thesauri are modeled:
<http://collection.britishmuseum.org/id/person-institution/207075> a crm:E21_Person, skos:Concept;
  skos:inScheme id:person-institution;
  skos:prefLabel "Alfonso Ruspagiari";
  bmo:PX_gender <http://collection.britishmuseum.org/id/thesauri/gender/male>;
  bmo:PX_nationality <http://collection.britishmuseum.org/id/thesauri/nationality/Italian>;
  bmo:PX_profession <http://collection.britishmuseum.org/id/thesauri/profession/sculptor/medallist>.
Nationality and Profession are modeled as Groups (Gender is not, it's merely a Type):
bmo:PX_gender rdfs:subPropertyOf crm:P2_has_type .
bmo:PX_nationality rdfs:subPropertyOf crm:P107i_is_current_or_former_member_of .
bmo:PX_profession rdfs:subPropertyOf crm:P107i_is_current_or_former_member_of .
In RS the search relation (FR) "created by" is transitive over P107i_is_current_or_former_member_of.
So if you search by "French" or "sculptor" you'll find objects created by the respective nationality or profession.
If the artist is unknown, you can still say that the E74_Group "French" and the E74_Group "sculptor" P14i_performed the Production, and the same searches will work.
- Vladimir: As for "18th century": RS cannot currently search by dates of the creator (life and floruit), but it can search by creation dates, which I think is "close enough".
Closely Related Group
- Emmanuelle: I think it would make sense to model 'unknown artist' as a group because it is a denomination that actually contains many different entities (all unknown artists are not the same), and also because traditionally when we speak we create a short cut but we actually cannot be sure that 'unknown artist' is only one artist for a work of art.
- The National Gallery has grappled with this as well and models 'Unknown Artist' as a sub-group of E74: http://research.ng-london.org.uk/wiki/index.php/Category:EN41-Artist_Sub_Group:
The production of a particular painting (E84) was carried out by an Unknown Artist (E21) who was known to be part of the EN41.Artist_Sub_Group of a known/individually defined Master (E21)
- I have represented this logic in the attached graph. On the model of the NG I have also moved Circle of, Studio of and Workshop of to be Production association codes for E74_Group.
Attachment: YCBA Production2 unknown creator.pdf
- Vlado: this closely follows BM Association Mapping v2#Production by Closely Related Group, and it's more faithful modeling than before. But it has implications for Search that need to be discussed with BM.
Need to track the discussion at BM Association Mapping Problems#Closely Related Group.
- Vlado: not all these codes are the same, and they map to different CRM constructs. I would group the NG codes as follows:
- EN20-Studio of, EN23-Circle of, EN24-Workshop of, EN25-School of: BM Association Mapping v2#Production by Closely Related Group
- EN44-Associate of, EN21-Follower of, EN22-Imitator of, EN42-After: BM Association Mapping v2#Influenced By
- EN43-Attributed to: BM Association Mapping v2#Probably/Unlikely Produced By
- Some of these are subject to debate. Eg BM mapped two codes that sound similar to me onto different patterns:
AJ: Circle/School of: BM Association Mapping v2#Production by Closely Related Group
S: School of/style of: BM Association Mapping v2#Influenced By
- Ken: Interesting conundrum about "Unknown" but it seems to be exactly as we know the world in traditional data. If one searches simply for Unknown, one does get everything Unknown, and we realize the works are not all by the same maker and that isn't a problem because we understand that Unknown is a class not an individual.
- Vlado: That's ok for this search case, as used by a person. But still there's a falsehood in the RDF, that all these paintings are by the same person. If you want to e.g. investigate painting similarity, this will trip you up. IMHO to avoid spurious unification, an Unknown should not be a known term or URI. Could be a blank (URI-less) node, which is by definition unique.
- Emmanuelle: traditionally when we speak we create a short cut but we actually cannot be sure that 'unknown artist' is only one artist for a work of art.
- Vlado: Exactly!
- Vlado: seems to me there are different degrees of unknown that may need different modeling, e.g.:
- http://collection.britishart.yale.edu/id/page/object/7 has 2 records:
<object/7/production/1> crm:P14_carried_out_by <id/person-institution/1180> . # Production by :: Formerly attributed to Benjamin Williams Leader
<object/7/production/3> crm:P14_carried_out_by <unknown> .
This says there were 2 painters, one is B.W.Leader and the other Unknown: which does not reflect the actual situation (there is 1 painter!)
Note: We should qualify production/1 with an EX_Association having type "unlikely/formerly-attributed-to" (this is described well at the page BM Association Mapping v2).
- In another case there may be actual evidence of 2 producers (e.g. Rembrandt and someone unknown from his Workshop).
- Vlado: The simplest solution could be to just mark with a flag "there's significant unknown info about the producer".
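One possible shape for object/7, combining the two ideas discussed above (a blank node for the unknown painter, and an EX_Association recording the former attribution), is sketched below. It is modeled by analogy with the EX_Association pattern used elsewhere on this page, not copied from BM Association Mapping v2. The association URI, the "formerly-attributed-to" type URI and the prefix IRIs are assumptions.

```turtle
# Prefix IRIs are assumed; use whatever the dataset actually declares.
@base         <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .
@prefix bmo:  <http://collection.britishmuseum.org/id/ontology/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# A single Production, because there was a single painter.
<object/7> crm:P108i_was_produced_by <object/7/production/1> .

# The unknown painter is a blank node, so it is never unified
# with the "unknown" painter of any other object.
<object/7/production/1> a crm:E12_Production ;
    crm:P14_carried_out_by [ a crm:E21_Person ; rdfs:label "unknown artist" ] .

# The former attribution to B.W. Leader is recorded as a qualified statement
# about the same Production, not as a second producer.
<object/7/production/1/association/1> a bmo:EX_Association ;
    crm:P140_assigned_attribute_to <object/7/production/1> ;
    crm:P141_assigned <person-institution/1180> ;
    bmo:PX_property crm:P14_carried_out_by ;
    crm:P2_has_type <thesauri/association/formerly-attributed-to> . # hypothetical term
```

Whether a direct P14_carried_out_by triple to Leader should also be kept for "unlikely" attributions is a question for the BM mapping and is deliberately left out here.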
Curatorial Comment
Yale: PX_curatorial_comment needs date and author added to the data model
- Vlado: this is clearly a case of EX_Association. It is a subclass of E13_Attribute_Assignment, so see attribute_assignment@crmg and recorder@crmg:
<obj> bmo:PX_curatorial_comment "comment".
<obj/comment/1> a bmo:EX_Association;
  crm:P140_assigned_attribute_to <obj>;
  crm:P141_assigned "comment";
  bmo:PX_property bmo:PX_curatorial_comment;
  crm:P14_carried_out_by <researcher>;
  crm:P4_has_time-span <obj/comment/1/date>.
<obj/comment/1/date> crm:P82_at_some_time_within "2013-07-06"^^xsd:date.
- you could alternatively use P3_has_note instead of PX_curatorial_comment, to accommodate apps that know CRM and EX_Association but not PX_curatorial_comment. But I still think using the subproperty is better
Inscriptions
I haven't seen any Inscription info in TTL. And in LIDO I see only empty XML tags for the following types, without any data:
Are there any objects with inscriptions for me to check?
Object Type and Genre
Object Type and Genre (lido:objectWorkType) are missing, eg
- There is a term for the 1st but it's not connected to the object.
<http://collection.britishart.yale.edu/id/thesauri/AAT/300033618>
- You try to make a term for the 3rd, but it seems to be looked up in the wrong thesaurus (it's local, not AAT):
<http://collection.britishart.yale.edu/id/thesauri/AAT/-1> a crm:E55_Type , skos:Concept ;
  skos:inScheme <http://collection.britishart.yale.edu/id/thesauri/subject> ;
  skos:prefLabel "architectural subject" .
I'd represent the first 2 as "type" and the last one as "subject":
<http://collection.britishart.yale.edu/id/object/7>
  crm:P2_has_type <http://vocab.getty.edu/aat/300033618>, # painting
    <http://vocab.getty.edu/aat/300015636>; # landscape
  crm:P62_depicts <http://collection.britishart.yale.edu/id/thesauri/subject/7>. # architectural subject
Note: BM has defined some subprops of P2_has_type: PX_object_type, PX_ware (for pottery), PX_escapement (for clocks). But so far I don't see a need to do this for Yale.
Current Owner, Keeper
This is probably wrong, it should describe the specific sub-organization (department):
crm:P50_has_current_keeper <http://collection.britishart.yale.edu/id/thesauri/department> ;
If you don't have departments in YCBA, just don't say anything about departments.
Current Location
You currently use per-object place representations. Such places are not searchable:
<http://collection.britishart.yale.edu/id/object/5005/location/1> a crm:E53_Place ; rdfs:label "Bay25" . # UnitType
<http://collection.britishart.yale.edu/id/object/5005/location/2> a crm:E53_Place ; rdfs:label "401" . # SubSite
<http://collection.britishart.yale.edu/id/object/5005/location/3> a crm:E53_Place ; rdfs:label "Yale Center for British Art" . # Site
<http://collection.britishart.yale.edu/id/object/5005/location/4> a crm:E53_Place ; rdfs:label "New Haven" . # Geo location
Recommendations:
it probably doesn't make sense to put location/1 and location/2 in a thesaurus, so they are correctly per-object. But add something to the label to explain what they mean. If there's a hierarchy between them and the coding won't get too complicated, something like this could be best:
rdfs:label "Storage unit: Bay25, shelf: 401"
location/3: YCBA is an organization. Here you mean "the place of that org", which is a known conundrum. Since you already say "YCBA is current owner/keeper", just skip this location.
location/4: A city is a well-known place, so just use the respective TGN URI
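Taken together, the recommendations above might come out roughly like this. It is only a sketch: linking the object with P55_has_current_location and nesting the storage place in the city with P89_falls_within are assumptions, the prefix IRIs are assumed, and the TGN URI is a placeholder to be replaced by the actual TGN record for New Haven.

```turtle
# Prefix IRIs are assumed; use whatever the dataset actually declares.
@base         <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<object/5005> crm:P55_has_current_location <object/5005/location/1> .

# location/1 and location/2 merged into one per-object place with a self-explaining label.
# location/3 (the YCBA site) is dropped, since YCBA is already stated as owner/keeper.
<object/5005/location/1> a crm:E53_Place ;
    rdfs:label "Storage unit: Bay25, shelf: 401" ;
    # Placeholder: substitute the real TGN URI for New Haven.
    crm:P89_falls_within <http://vocab.getty.edu/tgn/TGN-ID-FOR-NEW-HAVEN> .
```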
Dimensions
You state object dimensions (including their properties):
crm:P43_has_dimension <http://collection.britishart.yale.edu/id/object/7/height> ;
crm:P43_has_dimension <http://collection.britishart.yale.edu/id/object/7/width> ;
But you omit lido:extentMeasurements, i.e. don't state what was measured:
If you have data with lido:qualifierMeasurements, we should also consider it
You can add it like this:
<http://collection.britishart.yale.edu/id/object/7> crm:P39i_was_measured_by <http://collection.britishart.yale.edu/id/object/7/measurement>.
<http://collection.britishart.yale.edu/id/object/7/measurement> a crm:E16_Measurement ;
  crm:P40_observed_dimension <http://collection.britishart.yale.edu/id/object/7/height> ,
    <http://collection.britishart.yale.edu/id/object/7/width> ;
  rdfs:label "12 1/16 x 16 inches (30.6 x 40.6 cm). Extent: Support (PTG)" .
If "Support (PTG)" is a controlled value, it would be better to make a thesaurus for it, and we'll figure out how to attach it with P2_has_type.
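If the extent does become a controlled value, one way it could be attached is sketched below. This is one option among several and has not been agreed. The "extent" thesaurus, its URI and the prefix IRIs are hypothetical.

```turtle
# Prefix IRIs are assumed; use whatever the dataset actually declares.
@base         <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# The measurement states what was measured via a typed term, not only in the label.
<object/7/measurement> a crm:E16_Measurement ;
    crm:P40_observed_dimension <object/7/height> , <object/7/width> ;
    crm:P2_has_type <thesauri/extent/support-ptg> ; # hypothetical term
    rdfs:label "12 1/16 x 16 inches (30.6 x 40.6 cm). Extent: Support (PTG)" .

# Hypothetical local thesaurus entry for the extent.
<thesauri/extent/support-ptg> a crm:E55_Type , skos:Concept ;
    skos:inScheme <thesauri/extent> ;
    skos:prefLabel "Support (PTG)" .
```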
Bibliography
Page numbers, eg for http://collection.britishart.yale.edu/id/bibliography/2775