
Notes from May-Jul 2013
RS-1874

Intro

Lec: (Our mapping) is the same as Dominic's manual (to the best of our understanding)
Vlado: Does it comply with BM's latest changes to modeling Association codes (esp re Acquisition, Production)? BM Association Mapping v2. Dominic's document probably reflects this, but these are recent changes and I haven't checked.

YCBA uses the following systems:

  • Bedework = calendaring
  • TMS = art collections
  • Drupal = website, exhibitions etc
  • Orbis = books etc

Getting Yale On-board

Lec: If there is something else you need to get this to work with Research Space please let me know
Vlado: Once it's compliant, we should:

  • try loading the data
  • load your thesauri. Complete Getty or only the subset used by your objects? How about Broader terms?
  • implement Image Annotation over your DeepZoom images (we also use IIP Image for RKD images, so shouldn't be too hard)
  • coreference some of your terms to enable cross-collection search
  • create RForms for your objects

See RS Plan 3.7#Get Yale on-board with ResearchSpace for details. As of 08-Jul-2013, this iteration is under planning and the exact scope and start date are not clear. I think we can get Yale on board before mid-Sep (the Getty meeting), but it depends on the exact scope.

It appears that Yale is not the bottleneck: starting the RS3.7 iteration is. So please let's not rush this review process!

Legend

Please don't use color or strikethrough, use the following symbols for easier tracking (easiest to edit them in Wiki Markup mode):

  • open issue
  • resolved issue
  • issue under discussion

Eyeball

Lec: I am reviewing for any typos, missing types...

  • Have you tried Eyeball? See here: RDF Validation and Conversion#Eyeball
    RS-1071
    Lec: We tried Eyeball without success: after a number of attempts we were still unable to install it, so we will have to contact the dev community. TBD.

Problems

General

  • Pubby prefixes are not set up: shows "?:..."
    Lec: does same for BM, will try to fix
  • STRONGLY suggest having 1 URI per object, not 3 sameAs URIs
    Lec: BM has multiple, followed their lead
    Vlado: RS currently cannot work with these sameAs (eg would return results in triplicate). BM puts sameAs in separate files that we don't load
  • don't emit prefixes you don't need: lccn, oclc, ycba_aat, etc etc
  • crm:PX_* (e.g. crm:PX_display_wrap) is wrong, should be bmo:PX_*
    Lec: fixed with http://collection.britishart.yale.edu/id/ontology/PX_display_wrap
    Vlado: Please use bmo:PX_* and not ycba:PX_*: don't create a second property with the same purpose.
  • Please use bmo:EX_Association and not ycba:EX_Association: don't define your own class for the same purpose.
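
A minimal sketch of the prefix and PX_* points above. It assumes that bmo: resolves to http://collection.britishmuseum.org/id/ontology/ (by analogy with the ycba URL quoted above) and that crm: is the Erlangen CRM namespace; both are assumptions to verify against the BM export. The display string is taken from the exhibition listing of object 34.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
# Assumed namespace URIs: check them against the BM data loaded into RS.
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix bmo: <http://collection.britishmuseum.org/id/ontology/> .
# Declare only the prefixes that are actually used (no lccn, oclc, ycba_aat).

# One URI per object, with no sameAs aliases in the same file.
<object/34> a crm:E22_Man-Made_Object ;
    # Reuse the BM extension property; do not mint crm:PX_* or ycba:PX_*.
    bmo:PX_display_wrap "Exhibition :: An American's Passion for British Art - Paul Mellon's Legacy, 2007-2008" .
```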

Connected Resources

TODO Vlado: write down appropriate representations

It's best to list all URLs closely related to each object, eg
http://collection.britishart.yale.edu/id/object/5005
"Mrs. Abington as Miss Prue in Love for Love by William Congreve":

Thesauri

Whole Getty or Parts

Currently you emit thesaurus data together with the object, and only the terms used in the object:

  • This way you miss Broader terms, so eg a search for "Animal" or "Mammal" won't find objects indexed only with narrower terms (see FR Transitivity)
    • Lec: regarding Places, why don't we use Geographic Coordinates (included by Yale) and search by a bounding box?
    • Vlado: RS currently doesn't have bounding box search, because BM Places thesaurus doesn't include Geographic Coordinates.
      RS has place name search, that uses the place hierarchy.
  • This way you repeat data about the same term in many objects
  • If you start emitting objects in separate graphs like BM does (to be able to easily replace/delete), the term data will be duplicated in each of these object graphs
  • In several cases these in-object ad-hoc terms don't satisfy Thesaurus Requirements (see next section)

My strong recommendation is to export the complete Getty thesauri

  • we shouldn't wait for Getty to do an official mapping, since it'll take a few months for TGN and ULAN, and it won't satisfy the requirement to publish as CRM (see next section)
  • I can do this mapping. I'll also be involved in the Getty's mapping, so that's a good synergy
  • Getty's committed to publish as LOD, so hopefully they won't object, as long as we mark our export as Unofficial
  • use a separate thesaurus export config, like BM does

I think this is the biggest outstanding Yale task.

Thesaurus Requirements

You must comply with BMX Issues#Thesaurus requirements
Each term should be both a CRM entity of appropriate type, and skos:Concept.

  • You have this for some (eg Agents) but not others (eg Title Type).
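
As an illustration of the requirement above, a Title Type term could be typed both ways as sketched here. The ConceptScheme URI and its label are hypothetical, and the crm: namespace is an assumption; BMX Issues#Thesaurus requirements remains the authoritative list.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical scheme for the local Title Type thesaurus.
<thesaurus/title> a skos:ConceptScheme ;
    rdfs:label "YCBA Title Types" .

# The term is both a CRM entity of the appropriate type and a skos:Concept.
<thesaurus/title/alternate> a crm:E55_Type , skos:Concept ;
    skos:inScheme <thesaurus/title> ;
    skos:prefLabel "Alternate title" .
```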

Meta-Thesaurus

  • Each thesaurus (ConceptScheme) used by Yale should be described in Meta-Thesaurus and FR Names#YCBA Thesauri (this section will be merged to the rest of the table)
    • This applies to both Getty and YCBA local thesauri
    • TODO RS: extend RS to handle AAT, which is one ConceptScheme that includes a number of facets (hierarchies) for object type, material, technique, etc

Emmanuelle: It would be helpful to briefly go over the definitions for searchable and tagable.

  • Searchable is a thesaurus that can be used in FR search. The list of FRs is Meta-Thesaurus and FR Names#FR Names Table and the detailed definitions are FR Implementation. Examples:
    • BM Object is searchable using FR2_has_type "is/has/about" because it's mapped to P2_has_type of the object
    • BM Ware and BM Currency are searchable using the same FR because they are sub-properties of P2_has_type
    • BM Aspect is not searchable because it's P2_has_type of E55_Type of a E25 Man-Made Feature on the object (side of coin)
    • If IPTC code is similar to "subject", it should be searchable
    • BM Unit and BM Dimension are not searchable because they are attributes of a Dimension of the object, and there's no FR defined for dimensions
    • BM Place is searchable, even in a hierarchical way
    • BM Place Type (town, village) and BM Place Name Type (modern, archaic) are not searchable because FRs don't reach into the properties of a place
  • Taggable: whether the thesaurus is "interesting enough" to be used as a source of tags. Tags are general categories to be used for categorization of research questions and comments. See Tags Spec

Term Distribution

  • Yale: 99% of Yale terms come from TGN, AAT, and ULAN.
    1% of terms come from ODNB, IconClass, YCBA Local terms (Frames, ...)
    • Vlado: AAT surely has Frames?
  • Yale: example of a lesser known person (Elihu Yale) who's found in ODNB, VIAF, DBPedia but not ULAN: http://www.oxforddnb.com/view/article/30183 http://viaf.org/viaf/46310522/ http://dbpedia.org/page/Elihu_Yale
    • Vlado: such "local heroes" are a typical pattern for any museum. BM People also has "local heroes" that are not found in ULAN.
  • Lec: will it be helpful if we make connections to ODNB, VIAF, DBPedia?
    • Vlado: yes, assuming you can easily export such term data according to Thesaurus Requirements. If you source it from these external sources, you'd need to make the same SKOS & CRM mapping as for the rest, and register in the Meta-Thesaurus.
      If these are indeed less than 1%, I'd source them from a single thesaurus YCBA Local.

There will be a meeting at Getty in September 2013, with 1/2 day discussion on Vocabularies

Agents

  • here crm:E55_Type is wrong: a Group is not a Type
    <thesauri/nationality/British> a crm:E55_Type , crm:E74_Group , skos:Concept ;
  • SKOS says one prefLabel (per language). If you don't have a flag in TMS, call the first one prefLabel and the rest altLabel
    <person-institution/142> a crm:E21_Person , skos:Concept ;
    	skos:inScheme ycba:person-institution ;
    	skos:prefLabel "Robert Smirke I" , "Robert Smirke R. A." , "Robert Smirk" , "Robert I Smirke" , "Robert Smirke" ;
    
  • you don't have any date (P82_at_some_time_within) for <person-institution/142/birth/date>. This makes all the following statements useless, so kill them.
    <person-institution/142>
    	crm:P92i_was_brought_into_existence_by <person-institution/142/birth> ;
    <person-institution/142/birth> a crm:E63_Beginning_of_Existence ;
    	crm:P4_has_time-span <person-institution/142/birth/date>
    
    • Same for death
  • If you know whether it's a person or institution then use the respective specific subprop & subclass instead of the generic P92i, E63:
    Person: P98i_was_born, E67_Birth
    Group: P95i_was_formed_by, E66_Formation
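
Putting the Agents remarks together, a corrected export might look like the sketch below. The scheme URIs and the crm: namespace are assumptions, and the birth year is a placeholder for illustration only: the real value must come from TMS, and if there is none the birth nodes should be omitted entirely.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# A nationality is a Group, not a Type.
<thesauri/nationality/British> a crm:E74_Group , skos:Concept ;
    skos:inScheme <thesauri/nationality> ;            # hypothetical scheme URI
    skos:prefLabel "British" .

# Exactly one prefLabel; the remaining names become altLabels.
<person-institution/142> a crm:E21_Person , skos:Concept ;
    skos:inScheme <person-institution> ;              # stand-in for ycba:person-institution
    skos:prefLabel "Robert Smirke I" ;
    skos:altLabel "Robert Smirke R. A." , "Robert Smirk" , "Robert I Smirke" , "Robert Smirke" ;
    crm:P98i_was_born <person-institution/142/birth> .   # specific subproperty of P92i

<person-institution/142/birth> a crm:E67_Birth ;      # specific subclass of E63
    crm:P4_has_time-span <person-institution/142/birth/date> .

# Emit the birth chain only when a date is actually known.
<person-institution/142/birth/date> a crm:E52_Time-Span ;
    crm:P82_at_some_time_within "1753"^^xsd:gYear .   # placeholder year, not checked against TMS
```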

Thesaurus URIs

  • Use more logical URIs that reflect the nature of the resource or type, and don't reflect their genesis in existing systems:
    <thesauri/event/exhibition_history> -> <thesauri/event/exhibition> (an exhibition is NOT "exhibition history")
    <event/some-exhibition/TMS/exhibition_history> -> <event/some-exhibition/identifier> (an identifier is NOT "exhibition history")
    <thesauri/identifier/TMS/exhibition_history> -> <thesauri/identifier/exhibition> (doesn't matter your system is called TMS)
    

    Lec: this may need further discussion, since we may have other types of events with IDs from other systems. However, I made the changes as suggested.
    Vlado: you have a point. If you have 2 exhibition IDs then you need to add the system acronym

Exhibition URIs

  • We need to decide on the URI for exhibitions. Originally we had a short identifier; BM suggested using the title, but this does not always work well, eg see ObjectID 34
  • Vlado: Yes, pretty long titles in http://collection.britishart.yale.edu/id/page/object/34.
    Exhibition :: An American's Passion for British Art - Paul Mellon's Legacy, 2007-2008
    Exhibition :: Great British Paintings from American Collections: Holbein to Hockney, Thursday, September 27, 2001 - Sunday, December 30, 2001
    Exhibition :: J. M. W. Turner - A Selection of Paintings from the Collection of Mr. and Mrs. Paul Mellon, 1968-1969
    
  • RS doesn't care what the URI is

Getty URIs

Objects

Titles

  • Why do you need these duplicate types?
    crm:P2_has_type <thesaurus/title/Alternate-title> , <thesaurus/title/alternate> .
  • I'm not sure what "Repository title" is. But if it means Preferred, then this is also unnecessary duplication:
    	crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
  • these two titles are duplicated. Keep just one of them: I suggest <title/1> for uniformity with the alternate title(s)
    <object/19850/title/1> a crm:E35_Title ;
    	rdfs:label "Malvolio Dancing" ;
    	crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
    <object/19850/title/primary> a crm:E35_Title ;
    	rdfs:label "Malvolio Dancing" ;
    	crm:P2_has_type <thesaurus/title/Repository-title> , <thesaurus/title/preferred> .
    

Acquisition

  • don't you know who gave up the object? Or did I just hit an object that doesn't have this data?
    <object/19850/acquisition> a crm:E10_Transfer_of_Custody , crm:E8_Acquisition ;
    	crm:P22_transferred_title_to <thesauri/ULAN/500303557> ;
    	crm:P29_custody_received_by <thesauri/ULAN/500303557> ;
    	rdfs:label "Yale Center for British Art, Paul Mellon Collection" .
    
  • format the label as "Transferred to ..." (now it reads as an Agent, not as a Transfer)
  • you state a shorter title about the agent in ULAN. Pick shorter or longer and use it consistently:
    <thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ;
    	skos:prefLabel "Yale Center for British Art" ;
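
A sketch of how the three Acquisition remarks above could be combined. The previous-owner URI is a hypothetical placeholder (emit those two statements only when TMS records who gave up the object), the shorter agent name is picked arbitrarily for consistency, and the crm: namespace is an assumption.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<object/19850/acquisition> a crm:E10_Transfer_of_Custody , crm:E8_Acquisition ;
    # The label describes the transfer, not the receiving agent.
    rdfs:label "Transferred to Yale Center for British Art" ;
    crm:P22_transferred_title_to <thesauri/ULAN/500303557> ;
    crm:P29_custody_received_by <thesauri/ULAN/500303557> ;
    # Hypothetical previous owner: include only if the data exists.
    crm:P23_transferred_title_from <person-institution/PREVIOUS-OWNER> ;
    crm:P28_custody_surrendered_by <person-institution/PREVIOUS-OWNER> .

# The same (shorter) name is used for the agent itself.
<thesauri/ULAN/500303557> a crm:E74_Group , skos:Concept ;
    skos:prefLabel "Yale Center for British Art" .
```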
    

Images

  • This is wrong, see image_objects_carriers@crmg
    <object/7> P62_depicts <object/7/image/1>

    TODO Vlado: write the correct one

  • this is wrong
    <object/7/image/1> P108i_was_produced_by <object/7/image/1/creation>.   # images are conceptual, so use P94i_was_created_by
    <object/7/image/1/creation> P14_carried_out_by <object/thesauri/actor>; # by WHOM?! If you have no info, don't output Creation
      rdf:type crm:E12_Production.                                          # E12_Production is for material objects
    
    • Emmanuelle: I was trying to express the fact that the image is supplied/was made by YCBA, hence P108i_was_produced_by.
    • Vladimir: ok, but use the correct type and properties for Image (a conceptual object), and YCBA's URL:
      <object/7/image/1> P94i_was_created_by <object/7/image/1/creation>.
      <object/7/image/1/creation> P14_carried_out_by <thesauri/ULAN/500303557>;
        rdf:type crm:E65_Creation.
      

Image Kinds

RKD has the following image kinds (in separate thesauri):

  • area captured: overall, detail, from left, from bottom...
  • side captured: front, back
  • object status: before treatment, during treatment, after treatment
  • documentation type: X-ray film, action photograph, black and white detail photograph, black and white photograph, color transparency etc

(BM doesn't have such notions.)

Yale has similar notions. Our lovely Miss Prue http://collections.britishart.yale.edu/vufind/Record/1669236 has 8 images:

  • cropped to image, recto, unframed
  • recto, unframed
  • framed, recto
  • framed, verso
  • detail, recto
  • detail, recto
  • Composite X-radiograph
  • cropped to image, recto, unframed

Yale needs to represent this. For now I think it's enough to lump them in one thesaurus, eg:

<http://collection.britishart.yale.edu/id/object/5005> 
  PX_has_main_representation <http://deliver.odai.yale.edu/content/repository/YCBA/object/5005/type/2/format/1>.
<http://deliver.odai.yale.edu/content/repository/YCBA/object/5005/type/2/format/1>
  P2_has_type <yale/thes/image/kind/cropped_to_image>, <yale/thes/image/kind/recto>, <yale/thes/image/kind/unframed>.
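
For the P2_has_type values above to meet the Thesaurus Requirements, the image-kind terms themselves would also need to be exported. A minimal sketch, assuming a single hypothetical ConceptScheme at <yale/thes/image/kind> and an assumed crm: namespace:

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# One lumped thesaurus for all image kinds (hypothetical URI and label).
<yale/thes/image/kind> a skos:ConceptScheme ;
    rdfs:label "YCBA Image Kinds" .

<yale/thes/image/kind/cropped_to_image> a crm:E55_Type , skos:Concept ;
    skos:inScheme <yale/thes/image/kind> ;
    skos:prefLabel "cropped to image" .

<yale/thes/image/kind/recto> a crm:E55_Type , skos:Concept ;
    skos:inScheme <yale/thes/image/kind> ;
    skos:prefLabel "recto" .

<yale/thes/image/kind/unframed> a crm:E55_Type , skos:Concept ;
    skos:inScheme <yale/thes/image/kind> ;
    skos:prefLabel "unframed" .
```

If the kinds are later split into separate thesauri as RKD does (area, side, status, documentation type), only the skos:inScheme statements would need to change.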

All Image Sizes

Deep Zoom images

TODO Vlado

Image Rights

  • This says nothing (has no fields)
    http://collection.britishart.yale.edu/id/page/object/7/image/1/restriction
  • this has the following problems:
    <object/7/image/1> P70i_is_documented_in <object/7/image/1/terms_of_use>.  # 1. should be P104_is_subject_to
    <object/7/image/1/terms_of_use>                     # 2. it's not specific to this image, so don't use per-image node
      rdfs:label "http://hdl.handle.net/10079/w6m90dq"; # 3. should be URI not string, 4. this redirects, so just use the final destination
      rdf:type crm:E62_String.                          # 5. means nothing. So-called "CRM Primitive types" should not be used
    
    • 6. so simply use this:
      <object/7/image/1> P104_is_subject_to <http://britishart.yale.edu/terms/imaging/unrestricted>.
      <http://britishart.yale.edu/terms/imaging/unrestricted> a E30_Right; rdfs:label "Public Domain".
      
    • 7. better yet, use a CreativeCommons URI, since CC is a stronger authority about rights than YCBA
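
Point 7 could look like the sketch below. The Creative Commons Public Domain Mark URI is used purely as an example: whether that, CC0, or another statement matches YCBA's actual terms is an assumption that YCBA must confirm. The crm: namespace is also assumed.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Every unrestricted image points at the same shared Right node.
<object/7/image/1> crm:P104_is_subject_to <http://creativecommons.org/publicdomain/mark/1.0/> .

<http://creativecommons.org/publicdomain/mark/1.0/> a crm:E30_Right ;
    rdfs:label "Public Domain Mark 1.0" .
```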

Discussion

  • Emmanuelle: I have some contextual information regarding my modeling for image rights that might help, since we are doing things a bit differently from the BM on this I believe.
    • Vladimir: indeed, BM claims rights (eg images\assets_0.trig):
      <http://collection.britishmuseum.org/id/object/MCT3411> 
        crm:P138i_has_representation <http://www.britishmuseum.org/collectionimages/AN00589/AN00589075_001_l.jpg>.
      <http://www.britishmuseum.org/collectionimages/AN00589/AN00589070_001_l.jpg> 
        crm:P105_right_held_by thesIdentifier:the-british-museum.
      
  • Emmanuelle: YCBA does not claim rights over images, just points to image use page, hence P70i_is_documented_in rather than P104_is_subject_to. YCBA does not say who owns the image rights.
    • Vladimir: that's fine: my example above (6) doesn't say YCBA claims any rights.
      It just says the image P104_is_subject_to a Rights object, which allows unrestricted usage. See the scope note to be convinced this is the right class to use: "This class comprises legal privileges concerning material and immaterial things"

Object Rights

Acquisition

  • P30_transferred_custody_of is wrong direction
    Lec: Replaced with P30i_custody_transferred_through
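
To make the direction issue concrete: P30 points from the transfer activity to the object, and P30i is its inverse pointing from the object to the activity. The sketch reuses object 19850 from the earlier Acquisition example purely as an illustration, with an assumed crm: namespace. Emit one direction or the other, depending on which node is the subject.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm: <http://erlangen-crm.org/current/> .   # assumed CRM namespace

# Forward direction: the activity is the subject.
<object/19850/acquisition> a crm:E10_Transfer_of_Custody , crm:E8_Acquisition ;
    crm:P30_transferred_custody_of <object/19850> ;
    crm:P24_transferred_title_of <object/19850> .

# Inverse direction: the object is the subject.
<object/19850>
    crm:P30i_custody_transferred_through <object/19850/acquisition> ;
    crm:P24i_changed_ownership_through <object/19850/acquisition> .
```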

Concepts

  • useless intermediate node
    http://collection.britishart.yale.edu/id/page/object/20049/concept/1
    • just use
      <object/20049> P129_is_about <http://collection.britishart.yale.edu/id/thesauri/AAT/124118>,
                 <http://collection.britishart.yale.edu/id/thesauri/AAT/120779>. # etc 
      
  • if you can't find a term during mapping, report an error, don't export it as "-1"
    http://collection.britishart.yale.edu/id/thesauri/AAT/-1
    Lec: Emmanuelle, please have some students go through TMS, I exclude now anything that has -1
  • Lec: Emmanuelle there are cases where subjects are TGN where conceptID = 0, I will try to ignore. Example ObjectID = 34

Production

When and why to use <obj/production/M/association> a ycba:EX_Association
That's defined by the specific sections in BM Association Mapping v2. The Intro section describes 3 patterns: code in Event, code in Subevent, code in Association, and the specific sections say which to use for which part of your data

  • No point using BOTH Subevents (parts) and EX_Association

Produced By Specific Process

What do I do for the following crm:P14_carried_out_by association codes? Do they become types or labels?
These are all types of Production sub-events, because they pertain to the nature of the production process. See BM Association Mapping v2#Produced By Specific Process for the pattern:

  • AR-Artist
  • AU-author
  • FB-finished by
  • M-maker
  • R-printer (printed by)
  • PM-printmaker (print made by)
  • Z-publisher (published by)
  • TU-touched up by (print)

BTW: if you have just 1 code, it would be nice to optimize and not create sub-events, but BM doesn't do that
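
A sketch of the "code in Subevent" pattern as I read it from the description above, for an object with a printer (R) and a publisher (Z). The sub-event URIs, the type terms and the agent URIs are all hypothetical, and the crm: namespace is assumed; BM Association Mapping v2#Produced By Specific Process is the authoritative pattern.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm: <http://erlangen-crm.org/current/> .   # assumed CRM namespace

# The overall production is split into one sub-event per association code.
<obj/production> a crm:E12_Production ;
    crm:P9_consists_of <obj/production/R> , <obj/production/Z> .

# R = printer (printed by): the code becomes the type of the sub-event.
<obj/production/R> a crm:E12_Production ;
    crm:P2_has_type <thesauri/production/printed-by> ;      # hypothetical type term
    crm:P14_carried_out_by <person-institution/PRINTER> .   # hypothetical agent

# Z = publisher (published by).
<obj/production/Z> a crm:E12_Production ;
    crm:P2_has_type <thesauri/production/published-by> ;    # hypothetical type term
    crm:P14_carried_out_by <person-institution/PUBLISHER> . # hypothetical agent
```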

Unknown Artist

Some records say "production performed by Unknown Artist".

  • Lec: removed unknown artists. This will need to be discussed with Emmanuelle, who has some reservations about it. I made the change in an effort to get our data to work with RS.
  • Vladimir's considerations:
    • RS is agnostic about this
    • CONS: If you say 10 paintings are made by Unknown Artist and use the same term, that's false because they may have been made by different people.
    • CONS: The sem web way of expressing that some info is missing is simply NOT to say anything.
    • PRO: if you want to search in RS for "paintings with Unknown creator", you can do it when mapped to an Unknown person or Unknown group. But you cannot currently search for "paintings that don't have creator info"
  • Ken: Interesting conundrum about "Unknown" but it seems to be exactly as we know the world in traditional data. If one searches simply for Unknown, one does get everything Unknown, and we realize the works are not all by the same maker and that isn't a problem because we understand that Unknown is a class not an individual. Such a search however is usually combined with limiting qualifiers like "Unknown French 18th-century sculptor." Being unable to search in that fashion seems to me the much more limited option.
  • Dominic: I am not sure that a person URI should refer to a generic 'unknown' which would be the same unknown person. Generally, if something is unknown it shouldn't have a triple. The absence of a triple means that it is not known. Producing triples that state a null doesn't seem useful. Is there a particular reason for specifically stating that an artist is unknown? A query for an object created in the 18th century, in the French style and that was produced by the technique of sculpting would return the result in the example. Further, if the person was unknown then you couldn't assume that s/he was French. In other words if there is a way to query to get the result required without a null then this seems preferable.
  • Emmanuelle: I agree with Ken. 'Unknown creator' for an art historian carries meaningful information. It is not synonymous with a null value (of course, since all works of art have creators). Rather it means that the attribution research has not come up with conclusive information for now, and that situation can last for many years, even centuries.
  • Vladimir: You can express "French" and "Sculptor". Here's how the BM thesauri are modeled:
    <http://collection.britishmuseum.org/id/person-institution/207075> 
      a crm:E21_Person, skos:Concept;
      skos:inScheme id:person-institution;
      skos:prefLabel "Alfonso Ruspagiari";
      bmo:PX_gender <http://collection.britishmuseum.org/id/thesauri/gender/male>;
      bmo:PX_nationality <http://collection.britishmuseum.org/id/thesauri/nationality/Italian>;
      bmo:PX_profession <http://collection.britishmuseum.org/id/thesauri/profession/sculptor/medallist>.
    

    Nationality and Profession are modeled as Groups (Gender is not, it's merely a Type):

      bmo:PX_gender rdfs:subPropertyOf crm:P2_has_type .
      bmo:PX_nationality rdfs:subPropertyOf crm:P107i_is_current_or_former_member_of .
      bmo:PX_profession rdfs:subPropertyOf crm:P107i_is_current_or_former_member_of .
    

    In RS the search relation (FR) "created by" is transitive over P107i_is_current_or_former_member_of.
    So if you search by "French" or "sculptor" you'll find objects created by the respective nationality or profession.
    If the artist is unknown, you can still say that the E74_Group "French" and the E74_Group "sculptor" P14i_performed the Production, and the same searches will work.

  • Vladimir: As for "18th century": RS cannot currently search by dates of the creator (life and floruit), but it can search by creation dates, which I think is "close enough".
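
Vladimir's suggestion for an unknown French sculptor could be sketched as follows: no Unknown person node is emitted, and the nationality and profession Groups are attached directly to the Production. The Yale thesaurus URIs are hypothetical and the crm: namespace is assumed.

```turtle
@base <http://collection.britishart.yale.edu/id/> .
@prefix crm:  <http://erlangen-crm.org/current/> .   # assumed CRM namespace
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# The Groups, not an "Unknown Artist" individual, carry out the production.
<obj/production> a crm:E12_Production ;
    crm:P14_carried_out_by <thesauri/nationality/French> ,
                           <thesauri/profession/sculptor> .

<thesauri/nationality/French> a crm:E74_Group , skos:Concept ;
    skos:prefLabel "French" .

<thesauri/profession/sculptor> a crm:E74_Group , skos:Concept ;
    skos:prefLabel "sculptor" .
```

Because the "created by" FR is transitive over P107i_is_current_or_former_member_of, a search for "French" or "sculptor" should then match this object in the same way it matches objects by named artists who belong to those Groups.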

Closely Related Group

Curatorial Comment

  • Yale: PX_curatorial_comment needs date and author added to the data model
  • Vlado: this is clearly a case of EX_Association. It is a subclass of E13_Attribute_Assignment, so see attribute_assignment@crmg and recorder@crmg:
      <obj> bmo:PX_curatorial_comment "comment".
      <obj/comment/1> a bmo:EX_Association;
        P140_assigned_attribute_to <obj>; P141_assigned "comment"; bmo:PX_property bmo:PX_curatorial_comment;
        P14_carried_out_by <researcher>;
        P4_has_time-span <obj/comment/1/date>.
      <obj/comment/1/date> P82_at_some_time_within "2013-07-06"^^xsd:date.
    
    • you could alternatively use P3_has_note instead of PX_curatorial_comment, to accommodate apps that know CRM and EX_Association but not PX_curatorial_comment. But I still think using the subproperty is better

Inscriptions

I haven't seen any Inscription info in TTL. In LIDO I see only empty XML tags (no data) for the following types:

Are there any objects with inscriptions for me to check?
