
{excerpt}RKD Thesauri, with the area where each is used and examples{excerpt}
{toc}
{attachments:sortBy=name}

h1. RKD Thesauri

h2. Thesauri description

The table below lists the Rembrandt thesauri, with a description of each thesaurus, the area in which it is used, whether a sample of the thesaurus has been provided, and questions/comments. The thesauri named Locations, IconClass, Artworks, People and Concepts are not provided by RKD but are taken from the eCulture data cloud (see PNG below).

|| Thesaurus || Description || Area || Sample provided? \\ || Questions/ Comments ||
| -Locations- | | | | Not used at the moment from RKD (see the explanation in geographical terms) |
| IconClass | | | | Are you using IconClass thesaurus? Could you provide it to us? {color:#ff0000}{*}Yes, we are using IconClass in the RKDimages database. However, this is probably not exactly the same version as the one that you find in the eCulture Data Cloud. We will send the vocabulary that we are currently using for our Rembrandt data entry to you. WD, 04-11-2011{*}{color} \\ |
| -Artworks- | | | | Are you using Artworks thesaurus? Could you provide it to us? {color:#ff0000}{*}I think, but I'm not sure, that this thesaurus is a thesaurus of works of art in our RKDimages database. It has been mapped for the eCulture data cloud as part of the Multimedian project some years ago (*{color}{color:#ff0000}*[http://e-culture.multimedian.nl/]*{color}{color:#ff0000}*). No one from our team was involved in this and we know very little about it I'm afraid. But if you have specific questions about it, we could may be try to find someone who could answer those. WD, 04-11-2011{*}{color}\\ |
| Concepts | | | | Maria, what is this? |
| RKDartists | an elaborate thesaurus of artist names and other persons in the 'art historical scene', with information on name variants, life dates, dates & places of activity, references to publications, etc. | RKDArtists | | Is the RKDartists thesaurus part of the People thesaurus? {color:#ff0000}{*}I think the database RKDartists was referred to as "People" when it was mapped for the Multimedian Project. Since then, the database has been worked on on a daily basis (adding new records, but also cleaning up of old records), so I think the RKDartists & thesaurus as we use it today, is different from the People thesaurus in the e-culture cloud. For your information: in RKDimages we also use a couple of separate vocabularies with names of individual people (such as private collectors), which we would like to merge with RKDartists in the end. But this is a very time consuming task and I don't see this happening in the next year or even the year after that. We will just send you what we use, that will make this issue more clear I think. WD 04-11-2011.*{color}\\ |
| geographical terms | cities, countries | | \\
\\
Y | Is Geographical terms thesaurus the same as Locations thesaurus? {color:#ff0000}{*}Same here as above. Since the mapping to the e-culture data cloud a lot of work has been done to clean up this thesaurus. We will send you the version that we use at the moment. WD, 04-11-2011{*}{color}\\
See Vocabulary GEOGRAPHICAL THESAURUS (Dutch).xml |
| institution names | museums, laboratories, etc | RKDImages, RKDTechnical | | |
| whereabouts type | museum, private collection, church, etc. | RKDImages, RKDTechnical | \\
Y \\ | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| object type | e.g. painting, sculpture, drawing, etc. | RKDImages, RKDTechnical | \\
Y \\ | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| shape | e.g. vertical rectangle, oval, etc. | RKDImages, RKDTechnical | \\
Y \\ | \\
see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| support | The type of material which a work of art has been made on, such as canvas, wood or stone, 208 terms, English and Dutch. This is a very simple vocabulary, with no hierarchical or other relations and no scope notes. You will see that we have not only translated the terms relevant for Rembrandt paintings, but also terms relevant for other types of objects (drawings, prints, etc.) | RKDImages, \\
RKDTechnical | Y \\
\\
Y \\ | see Vocabulary SUPPORT RKDimages(Dutch) \\
In "Vocabulary Support(English).csv" (RKDTechnical) in column term.type there are records with type "Object type", "Technique" - are these errors, or we should treat all values as Support? {color:#ff0000}{*}One term can have multiple term types. In this case, treat all terms as "Support". WD, 04-11-2011{*}{color}\\
Are you storing all thesauri for RKDImages (or RKDTechnical) in one table in the DB and distinguish based on the values in term.type? If this is so you could provide us without separating by categories. {color:#ff0000}{*}In RKDtechnical, we are storing all vocabularies in three different "tables" in the database: 1 for geographical terms, 1 for persons&institutions names en 1 for all other terms/concepts. For all of these, we could send you one csv file. However, one term can belong to multiple term types. In the csv file, you will only see one of those. If we send them to you seperately, you will find that some terms are in several csv files. So this gives more information we think. Do you agree? In RKDimages all vocabularies are in several tables, so we need to send those as separate files anyway. WD, 04-11-2011.*{color}\\
see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| Technique | e.g. oil paint, pen and brown ink, pencil, etc. | RKDImages, RKDTechnical | \\
Y \\ | \\
see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| qualification attribution | e.g. after, possibly, studio of, etc. | RKDImages, RKDTechnical | \\
Y \\ | \\
see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| In RKDtechnical: | | | | |
| persons names | researchers, conservators, etc. (will be integrated with RKDartists in due time) | RKDTechnical | | Is the Persons names thesaurus part of the People thesaurus? {color:#ff0000}{*}No, at this moment it is separate. We would like to merge it, but this will not happen soon. WD, 04-11-2011{*}{color}\\ |
| research type | e.g. x-radiography, normal light studies, dendrochronology, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| analytical techniques | techniques applied on paint samples (will be integrated with "research types" shortly) | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| equipment | specific cameras, microscopes or other type of equipment used, with specifications | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| computer hardware | hardware that was used to create certain documentation, such as scanners | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| software | software that was used to create certain documentation | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| research reason / objective | e.g. conservation, publication, exhibition, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| object status | e.g. before treatment, during treatment, after treatment, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| area captured | part of the painting/paint sample captured in an image, e.g. back overall, front upper right corner,    etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| magnification | for images taken with a light \\
\- or stereomicroscope | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| document type | e.g. slide 35 mm, digital-born color photograph, research report, etc. The type of written document or image created when researching or treating a work of art, 119 terms, English, Dutch and German. This vocabulary has a hierarchical structure. Other relations or scope notes are not yet included (but will be at some point). | RKDTechnical | Y \\ | In "Vocabulary DOCUMENTATION Type (English).csv": \\
1. In column term.type there are records with type "Object type", "Technique" and "Area captured" - are these errors, or we should treat all values as Document types? {color:#ff0000}{*}One term can have multiple term types. In this case, treat all terms as "Document types". WD, 04-11-2011{*}{color}\\
2. In column term.status there are values with status "candidate" - how should we treat these values (to show them to the user or not)? Please, provide if there are other statuses. {color:#ff0000}{*}Please ignore these values, they don't mean anything at the moment. WD, 04-11-2011{*}{color}\\
3. We'll use broader_term but ignore narrower_term, since one row can have multiple narrower terms. Is this true? {color:#ff0000}{*}Y{*}{color}{color:#ff0000}{*}es\! WD, 04-11-2011{*}{color}\\
\\
see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| documentation whereabouts | location where the documentation is kept within an institution, e.g. conservation studio, library, archive, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| documentation number type | numbering system which is used for certain groups of documentation within an institution, e.g. inventory number, registration number, negative number, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| reason for sampling | reason why a paint sample was taken, e.g. conservation, attribution, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| sample type | e.g. cross-section, dispersed sample, varnish sample, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| location/area type | for paint samples, e.g. flesh, foliage, sky, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| location/area color | for paint samples, e.g. red, brown, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| paint defects | for paint samples, e.g. smalt discoloration, saponification of lead white, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| paint layer function | for paint samples, e.g. ground, surface paint layer, varnish, etc. | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| field | for images of paint samples, e.g. bright field, dark field, etc | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |
| light | for images of paint samples, e.g. normal light, uv, etc | RKDTechnical | Y | see Vocabularies RKDtechnical ALL TERM TYPES.xml |


Please note that, due to the duplication between parts of RKDtechnical and RKDimages described above, there is also duplication in some of the controlled vocabularies we use. For instance, both databases use controlled vocabularies of institution names, but these are not entirely identical. We intend to have everything integrated in Spring 2012.

h2. Data to Thesauri Mapping

The table below maps the fields in Susanna to the thesauri provided for Rembrandt. The first column shows the thesaurus name.

Thesaurus names that are struck through have not been received from RKD; those fields will be treated as free text until further decisions are made. Questions and comments are in the last column, "Questions/Comments".

{table-plus:autoNumber=true}

|| Thesaurus || URI of the thes. value \\ || Example Label \\ || <tag> || translation || Questions/ Comments ||
| -Artworks- | | | <benaming_kunstwerk> | object title | {color:#ff0000}{*}This is free text{*}{color}\\ |
| -Artworks- | | | <andere_benaming> | other/former title | {color:#ff0000}{*}This is free text{*}{color}\\ |
| rkd-plaats | rkd-plaats:amsterdam | "Amsterdam" | <vervaardigd_plaats_land> | country/place of making | |
| rkd-shape | rkd-shape:vertical-rectangle | "staande rechthoek"@nl | <vorm> | shape | |
| rkd-support | rkd-support:panel--oak | "paneel (eikenhout)"@nl | <drager> | support | |
| rkd-technique | rkd-technique:oil-paint | "olieverf"@nl | <materiaal> | medium/technique | |
| {color:#ff6600}rst-iconclass{color} | rst-iconclass:_71P412 \\ | "Susanna bathing, usually in or near a fountain and sometimes accompanied by two female servants"@en. \\ | <iconclass_code> | iconclass code | {color:#0000ff}Thesaurus created by Ontotext{color}\\ |
| {color:#ff6600}rkd-keywords{color} | rkd-keywords:oude_testament_apocriefen | "oude testament & apocriefen"@nl | <RKD_algemene_trefwoorden> | RKD keywords | Do you support KeyWords thesaurus? Could you provide it to us? {color:#ff0000}{*}We will provide you with the vocabulary "RKD algemene trefwoorden". WD, 04-11-2011{*}{color}\\
{color:#0000ff}Thesaurus created by Ontotext{color}\\ |
| rkd-plaats | rkd-plaats:geertruidenberg | "Geertruidenberg" | <plaats> | depicted location | |
| {color:#ff6600}rkd-frame{color} | rkd-frame:wood--gold_plated \\ | "hout, gestoken en verguld"@nl \\ | <lijstmateriaal> | material of frame | Do you use a Frame materials thesaurus (Support thesaurus for frames)? {color:#ff0000}{*}Yes, will send it to you. WD, 04-11-2011{*}{color}\\
{color:#0000ff}Thesaurus created by Ontotext{color}\\ |
| -RKDartists- | | | <naam_lijstenmaker> | name of frame maker | {color:#ff0000}{*}This is free text. WD 04-11-2011{*}{color}\\ |
| {color:#ff6600}rkd-artists{color} | rkd-artists:Rembrandt | crm:P3_has_note "Rembrandt" \\ | <naam> | artist name | {color:#0000ff}Thesaurus created by Ontotext{color} |
| rkd-plaats | rkd-plaats:amsterdam | "Amsterdam" | <land_plaats_anoniem> | city or country if anonymous | |
| {color:#ff6600}rkd-artists{color} | rkd-artists:WillemDePoorter \\ | crm:P3_has_note&nbsp; "Poorter, Willem de" \\ | <kunstenaar_art_verb> | artist of related object | {color:#0000ff}Thesaurus created by Ontotext{color} |
| {color:#ff6600}rkd-collection{color} | rkd-collection:Mauritshuis \\ | crm:P3_has_note "Koninklijk Kabinet van Schilderijen Mauritshuis"@nl \\ | <collectienaam> | collection name | Royal Cabinet of Paintings Mauritshuis \\
{color:#0000ff}Thesaurus created by Ontotext{color}\\ |
| | | | -<bruikleen_naam>- | -(MT) loan name- | Could the owner in <bruikleen_naam> be individual? In this case which thesauri do we use? {color:#ff0000}{*}Please ignore this field, we will no longer use it. WD, 04-11-2011{*}{color} |
| rkd-plaats | rkd-plaats:amsterdam | "Amsterdam" | <bruikleen_plaats> | (MT) loan place | Should <bruikleen_plaats>&nbsp;be ignored, given that <bruikleen_naam> is ignored? If not, what does it mean? |
| rkd-type_where | rkd-type-where:private-collection | "particuliere collectie"@nl | <soort_collectie_verblijfplaats> | collection type | |
| rkd-plaats | rkd-plaats:den-haag | "Den Haag" \\ | <plaats_collectie_verblijfplaats> | collection location | |
| -?auction house- | | | <veilinghuis_zc_nw> | auction house | Are the (<veilinghuis_zc_nw>) auction house names supported in Institution names thesauri, or this field is not linked to a thesauri and is free text.{color:#ff0000}{*}This is a separate vocabulary, will send it to you. WD, 04-11-2011{*}{color}\\
{color:#0000ff}Maria:{color} {color:#0000ff}Free text due to missing thesaurus{color}\\ |
| rkd-plaats | rkd-plaats:antwerpen | "Antwerpen" | <veilingplaats_zc_nw> | auction location | |
| -?People- | | | <inbrenger> | seller | Are the ( <inbrenger>) sellers names supported in any of the People/Persons Names/ RKDArtists thesauri, or this field is free text {color:#ff0000}{*}Free text. WD 04-11-2011{*}{color}\\ |
| -?People- | | | <naam_koper> | buyer | Are the ( <naam_koper>) buyers names supported in any of the People/Persons Names/ RKDArtists thesauri, or this field is free text {color:#ff0000}{*}Free text. WD 04-11-2011{*}{color} |
| {color:#ff6600}rst-currency{color} | rst-currency:HFL | "HFL" | <munt> | monetary unit | Is there any existing thesaurus for currencies supported? Should we use it for <munt> field, or it is free text. {color:#ff0000}{*}There is a vocabulary, will send it to you. WD, 04-11-2011{*}{color}\\
{color:#0000ff}Maria:{color} {color:#0000ff}Thesaurus created by Ontotext{color} |
| -institutions names- | | | <instelling_tentoonstelling/> | institution where exhibition takes place | {color:#0000ff}Maria:{color} {color:#0000ff}Free text due to missing thesaurus{color} \\ |
| rkd-plaats | rkd-plaats:berlijn | "Berlijn" | <plaats_tentoonstelling> | place of exhibition | |
| -Artworks- | | | <title> | | {color:#ff0000}{*}This is free text. This field in RKDtechnical is duplicated in RKDimages, <benaming_kunstwerk> (object title). We will use the value from RKDimages. WD, 04-11-2011{*}{color}\\ |
| -Artworks- | | | <title.other_older> | | {color:#ff0000}{*}This is free text. This field in RKDtechnical is duplicated in RKDimages, <andere_benaming> (other/formal title). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| -shape- | | | <object.shape> | | {color:#ff0000}{*}This is free text. This field in RKDtechnical is duplicated in RKDimages, <vorm> (shape). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| {color:#ff6600}unit{color} | unit:Centimeter | "cm"@nl | <object.size.unit> | | There is no Measurement Units thesaurus? Are you supporting such thesaurus? We plan to use QUDT thesaurus. {color:#ff0000}{*}We don't use a thesaurus for this field, but a drop down list of{*}{color} {color:#0000ff}{*}mm, cm, inch, pixels{*}{color}{color:#ff0000}*. However, the field is duplicated in RKDimages <eenheid> (unit) - not listed above, probably because it is free text - and we will use that value and not the one from RKDtechnical. WD, 04-11-2011{*}{color}\\
{color:#0000ff}Maria:{color} {color:#0000ff}Thesaurus created by Ontotext{color}\\ |
| rkd-support | rkd-support:panel--oak | "panel (oak)"@en | <object.support> | | {color:#ff0000}{*}This field in RKDtechnical is duplicated in RKDimages, <drager> (support). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| rkd-technique | rkd-technique:oil-paint \\ | "oil paint"@en | <object.technique> | | {color:#ff0000}{*}This field in RKDtechnical is duplicated in RKDimages, <materiaal> (medium/technique). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| -geographical terms- | | | -<whereabouts.city>- | | {color:#ff0000}{*}This is free text. This field in RKDtechnical is duplicated in RKDimages, <*{color}{color:#ff0000}{*}plaats_collectie_verblijfplaats{*}{color}{color:#ff0000}*>*{color} {color:#ff0000}*(collection location). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| -institutions names- | | | -<whereabouts.name>- | | {color:#ff0000}{*}This is free text. This field in RKDtechnical is duplicated in RKDimages, <collectienaam> (collection name). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| -RKDartists- | | | -<attribution.name>- | | {color:#ff0000}{*}This is free text. This field in RKDtechnical is duplicated in RKDimages, <naam> (artist name). We will use the value from RKDimages. WD, 04-11-2011{*}{color} |
| rkd-res_type | rkd-res_type:x-radiography | "X-radiography"@en \\ | <research.type> | | |
| {color:#ff6600}rkd-person{color} | rkd-person:A_B_de_Vries \\ | crm:P3_has_note&nbsp; "Vries, A.B. de" | <research.researcher> | | {color:#0000ff}Maria:{color} {color:#0000ff}Thesaurus created by Ontotext{color} \\ |
| {color:#ff6600}rkd-documentation{color} | rkd-documentation:X-ray_film \\ | "X-ray film"@en \\ | <doc.type> | | {color:#0000ff}Maria:{color} {color:#0000ff}Thesaurus created by Ontotext{color} |
| {color:#ff6600}rkd-person{color} | rkd-person\:P_Noble \\ | crm:P3_has_note&nbsp; "Noble, P." \\ | <doc.creator> | | {color:#0000ff}Maria:{color} {color:#0000ff}Thesaurus created by Ontotext{color} |
| {color:#ff6600}unit{color} | unit:Centimeter \\ | "cm"@nl \\ | <doc.size.unit> | | There is no Measurement Units thesaurus? Are you supporting such thesaurus? We plan to use QUDT thesaurus. {color:#ff0000}{*}See above{*}{color}\\
{color:#0000ff}Maria:{color} {color:#0000ff}Thesaurus created by Ontotext{color}\\ |
| -institutions names- | | | <doc.whereabouts> | | {color:#0000ff}Maria:{color} {color:#0000ff}Free text due to missing thesaurus{color} \\ |
| -rkd-whereabouts- \\
-\_location- | | | <doc.whereabouts.location> | | {color:#0000ff}Maria: Free Text: In the rkd-sample-reduced the value is&nbsp; "restauratieatelier", but I can't find such in the thesaurus.{color} \\ |
| -institutions names- | | | <file.image.location> | | {color:#0000ff}Maria:{color} {color:#0000ff}Free text due to missing thesaurus{color} \\ |
| rkd-objectstatus | rkd-objectstatus:after-treatment \\ | "after treatment"@en \\ | <file.spec.object_status> | | |
| rkd-area_captured | rkd-area_captured:overall \\ | "overall"@en \\ | <file.spec.overall_detail> | | |
| {color:#ff9900}rkd-area_captured{color} | rkd-area_captured:FRONT \\ | "front"@en \\ | <file.spec.front_back> | | {color:#0000ff}Maria: Thesaurus value created in rkd-area_captured by Ontotext{color} |
| rkd-area_captured | | | <reference_image.front> | | |
| rkd-area_captured | | | <reference_image.back> | | |
| -institutions names- | | | <file.application.location> | file.application.location | Vlado: not right, this says "RKD" (an abbreviation) \\
{color:#0000ff}Maria:{color} {color:#0000ff}Free text due to missing thesaurus{color}\\ |
| rkd-sam_type | rkd-sam-type:cross-section \\ | "cross-section"@en \\ | <sample.type> | | |
| rkd-an_techn | rkd-an_techn:analytical-microscopy \\ | "analytical microscopy"@en \\ | <sample.analytical_technique> | | |
| -institutions names- | | | <sample.whereabouts> | | {color:#0000ff}Maria:{color} {color:#0000ff}Free text due to missing thesaurus{color} \\ |
| -rkd-whereabouts_location- | | | <sample.whereabouts.location> | | {color:#0000ff}Maria: Free Text: In the rkd-sample-reduced the value is "restauratieatelier", but I can't find such in the thesaurus.{color} |
| {color:#ff9900}rkd-area_captured{color} | rkd-area_captured:FROM_BOTTOM | "from bottom"@en \\ | <sample.location.vert.start> | | {color:#0000ff}Maria: Thesaurus value created in rkd-area_captured by Ontotext{color} |
| {color:#ff9900}rkd-area_captured{color} | rkd-area_captured:FROM_LEFT \\ | "from left"@en \\ | <sample.location.hor.start> | | {color:#0000ff}Maria: Thesaurus value created in rkd-area_captured by Ontotext{color}{color:#0000ff}.{color} \\ |
{table-plus}

h1. Thesaurus Migration

h3. Migration from Vocabularies RKDtechnical ALL TERM TYPES.xml



This XML file contains most of the RKDTechnical thesauri: support, technique, reason for sampling, sample type, location/area color, document type, etc. (see the Thesauri description section above), in English, German and Dutch. The values in the file can be split into separate thesauri based on the <term.type> field, which has a different value for each thesaurus (TECHNIQUE, OBJECT_TYPE, SOFTWARE, etc.). Some terms, such as "cardboard", have more than one <term.type>: "SUPPORT", "TECHNIQUE" and "OBJECT_TYPE". Such terms must therefore participate in more than one thesaurus.


The following tags should be migrated from RKDtechnical ALL TERM TYPES. If there are sub tags to the listed tags, they should also be migrated.

* *<priref> -* primary reference
(Vlado's comment on <priref>: Lookup from Susanna to the thesaurus is by *label*.
It's nicer to generate the URI from the EN label (or NL if EN is missing), see [URI Scheme - Proposal|#Rembrandtthesauri-URISchemeProposal]; but is the label unique within the thesaurus? So we may ignore this?)
* *<broader_term>* (Relations - skos:broader, crm:P127 has broader term) *\-* A term can have at most one broader term but several narrower terms, so we migrate the broader term and ignore the narrower terms. A term may also have no broader term.
* *<term.type> -* It is a design decision whether the terms are split by <term.type> and migrated into different thesauri (in which case one term can be present in more than one thesaurus), or whether the "data field" - "thesaurus" mapping is done by <term.type>. (Vlado's comment: skos:inScheme (I hope\!))
* *<term> -* the thesaurus term in English, German and Dutch (some terms are only in Dutch, others in English and Dutch)

\- <term.status> will not be migrated, because RKD think that the values are not meaningful for us. {color:#ff0000}*(*{color}{color:#ff0000}{*}Please ignore these values, they don't mean anything at the moment. WD, 04-11-2011)*{color}
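
Vlado's suggestion above (generate the URI from the EN label, falling back to NL, and check whether labels are unique) could be sketched as follows. This is only an illustration: the slug rule is an assumption inferred from examples in the mapping table, such as rkd-support:panel--oak for "panel (oak)", and the function names are hypothetical. The actual rule is whatever the URI Scheme proposal defines.

```python
import re


def label_to_local_name(label: str) -> str:
    """Turn a label into a URI local name (assumed rule, not the agreed scheme)."""
    # Lower-case the label and replace every character that is not an
    # ASCII letter or digit with a hyphen, then trim hyphens at both ends.
    slug = re.sub(r"[^a-z0-9]", "-", label.strip().lower())
    return slug.strip("-")


def choose_label(labels: dict) -> str:
    """Prefer the English label and fall back to the Dutch one."""
    return labels.get("en") or labels["nl"]


def find_collisions(labels):
    """Group labels that would produce the same local name."""
    by_name = {}
    for label in labels:
        by_name.setdefault(label_to_local_name(label), []).append(label)
    return {name: group for name, group in by_name.items() if len(group) > 1}


print(label_to_local_name("panel (oak)"))  # panel--oak
```

Running find_collisions over all labels of one thesaurus would answer the uniqueness question before <priref> is dropped. Note that this rule also turns accented letters into hyphens, which may or may not be acceptable.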

Example:

{code:xml}
<record>
<priref>1256</priref>
<broader_term.qualifier occurrence="1" lang="en-US"/>
<broader_term occurrence="1" lang="en-US" invariant="true">text or graphic representation (analogue)</broader_term>
<broader_term occurrence="1" lang="de-DE">Text oder graphischer Bild (analog)</broader_term>
<broader_term occurrence="1" lang="nl-NL">tekst of grafische weergave (analoog)</broader_term>
<input.date>2009-12-17</input.date>
<edit.date>2011-09-14</edit.date>
<edit.date>2011-09-01</edit.date>
<edit.date>2011-08-22</edit.date>
<edit.date>2010-02-16</edit.date>
<term.type option="DOC_TYPE" value="DOC_TYPE">
<text language="0">Document type</text>
<text language="1">Type documentatie</text>
<text language="3">Dokumentart</text>
</term.type>
<term.type>Document type</term.type>
<input.name>RKD Technical thesaurus collation</input.name>
<edit.name>SW</edit.name>
<edit.name>wtv</edit.name>
<edit.name>wtv</edit.name>
<edit.name>rz</edit.name>
<term occurrence="1" lang="en-US" invariant="true">condition report</term>
<term occurrence="1" lang="de-DE">Zustandsbericht (analog)</term>
<term occurrence="1" lang="nl-NL">conditierapport (analoog)</term>
<edit.time>16:24:27</edit.time>
<edit.time>14:30:57</edit.time>
<edit.time>16:12:59</edit.time>
<edit.time>06:45:19</edit.time>
<term.status option="1" value="1">
<text language="0">approved preferred term</text>
<text language="1">descriptor</text>
<text language="2">descripteur</text>
<text language="3">Deskriptor</text>
</term.status>
<term.status>approved preferred term</term.status>
<input.time>23:45:06</input.time>
<input.source>term list SW</input.source>
<edit.source>thesaum</edit.source>
<edit.source>thesaum</edit.source>
<edit.source>thesaum</edit.source>
<edit.source>thesaum</edit.source>
</record>
{code}
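
A record like the one above could be reduced to the tags selected for migration roughly as follows. This is a sketch only, using Python's standard xml.etree.ElementTree; the function names and the shortened sample record are illustrative, and taking the first two letters of the lang attribute as the language code is an assumption.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above, keeping only a few of its elements.
SAMPLE = """<record>
<priref>1256</priref>
<broader_term occurrence="1" lang="en-US" invariant="true">text or graphic representation (analogue)</broader_term>
<broader_term occurrence="1" lang="nl-NL">tekst of grafische weergave (analoog)</broader_term>
<term.type option="DOC_TYPE" value="DOC_TYPE"><text language="0">Document type</text></term.type>
<term.type>Document type</term.type>
<term occurrence="1" lang="en-US" invariant="true">condition report</term>
<term occurrence="1" lang="nl-NL">conditierapport (analoog)</term>
<term.status>approved preferred term</term.status>
</record>"""


def children(record, tag):
    """Direct child elements with exactly this tag name."""
    return [el for el in record if el.tag == tag]


def by_language(record, tag):
    """Map a two-letter language code to the text of each non-empty element."""
    return {el.get("lang", "")[:2]: el.text for el in children(record, tag) if el.text}


def read_record(record):
    """Keep <priref>, <term>, <broader_term> and <term.type>; drop the rest."""
    return {
        "priref": record.findtext("priref"),
        "term": by_language(record, "term"),
        "broader": by_language(record, "broader_term"),
        # Only the coded form (option="...") is kept; the plain-text repeat
        # of <term.type> has no option attribute and is skipped.
        "types": [el.get("option") for el in children(record, "term.type") if el.get("option")],
    }


rec = read_record(ET.fromstring(SAMPLE))
print(rec["priref"], rec["term"]["en"], rec["types"])
```

Because "types" is a list, a term such as "cardboard" with several <term.type> values can be placed in each of its thesauri, whichever of the two design options is chosen.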

h3. Migration from Support RKD Images (Dutch).xml

This thesaurus holds values for different support materials, in Dutch only. There are 213 terms in Support RKD Images (Dutch).xml and 208 terms in SUPPORT technical (see Vocabularies RKDtechnical ALL TERM TYPES.xml, <term.type> = SUPPORT). I made an informal comparison of the values in both thesauri, and all the values that I checked from SUPPORT technical are also present in SUPPORT images. An example of a value from Images that does not exist in technical is "ceramiek": RKD Images has two values, "ceramiek" and "keramiek", where "ceramiek" is <gebruikt_voor> (used for) of "keramiek", whereas RKD technical has only "keramiek".

\! (Maria) My proposal is, initially, not to migrate the values from Support RKD Images (Dutch).xml, and to map all data fields that use it to the terms in SUPPORT technical. The disadvantage is that 5 terms will be lost. The advantage is that RKDTechnical is multilingual, while RKD Images is only in Dutch.

\! {color:#ff0000}(Quotation from Wietske e-mail from 4.11.2011){color} {color:#ff0000}For the{color} {color:#ff0000}Rembrandt Database, we will not use the "duplicated" thesauri, because we do not want to present the brief object information in RKDtechnical, but the much more elaborate object information in RKDimages. So in fact, all duplicated thesauri from RKDtechnical could be ignored for the mapping. However, you might like to have them, because they contain the translations in English.{color}

If we decide to migrate from RKD Images, the following tags should be migrated:

* *<priref>*
* *<broader_term>* (Relations - skos:broader, crm: P127_has_broader_term)
* *<term>*

Example:


{code:xml}
<record>
<priref>160065</priref>
<term>keramiek</term>
<gebruikt_voor>ceramiek</gebruikt_voor>
</record>
<record>
<priref>160066</priref>
<broader_term>paneel</broader_term>
<term>paneel (olmenhout)</term>
</record>
{code}
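
The comparison between the two support thesauri described above could be automated along these lines. This is a rough sketch: the wrapping <records> element, the function names and the small technical term set are assumptions made for illustration, not the real file contents.

```python
import xml.etree.ElementTree as ET

# The two records from the example above, wrapped in a hypothetical root element.
IMAGES_XML = """<records>
<record><priref>160065</priref><term>keramiek</term><gebruikt_voor>ceramiek</gebruikt_voor></record>
<record><priref>160066</priref><broader_term>paneel</broader_term><term>paneel (olmenhout)</term></record>
</records>"""


def preferred_and_alternative(xml_text):
    """Return {preferred term: [alternative names]} for every <record>."""
    result = {}
    for record in ET.fromstring(xml_text).iter("record"):
        term = record.findtext("term")
        result[term] = [el.text for el in record.findall("gebruikt_voor")]
    return result


def classify_missing(images, technical_terms):
    """Split what SUPPORT technical lacks into lost terms and recoverable names."""
    lost, recoverable = [], []
    for term, alternatives in images.items():
        if term not in technical_terms:
            lost.append(term)  # no counterpart at all in SUPPORT technical
            continue
        # The preferred term exists, so its alternative names could be
        # attached to it as extra labels instead of being dropped.
        recoverable += [a for a in alternatives if a not in technical_terms]
    return sorted(lost), sorted(recoverable)


images = preferred_and_alternative(IMAGES_XML)
technical_nl = {"keramiek", "paneel (olmenhout)"}  # hypothetical subset, for illustration
print(classify_missing(images, technical_nl))
```

With this split, a name like "ceramiek" shows up as recoverable rather than lost, which may reduce the number of terms that are really lost under the proposal.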


h3. Migration from Vocabulary GEOGRAPHICAL THESAURUS (Dutch).xml

The xml contains detailed information about geographical places, like name of the place, broader and narrower places, equivalent names, other names (use for), detailed description and a lot of system information like date of import in the thesaurus, modification date, who imported the value, who modified etc. The values in the thesaurus are only in Dutch. (Task: Maria to send a letter to RKD, requesting&nbsp; the thesaurus in Eng.)

* *<priref>* \- primary reference
* *<ruimere_term>* (Relations \- skos:broader, crm: P89 Falls within) \- broader term; the broader term also exists as a term. A term can have at most one broader term but several narrower terms, so we will ignore the narrower terms.
* *<equivalente_term>* \- equivalent term, which also exists as a term. A term may or may not have equivalent terms.
* *<term>* \- the name of the place
* *<gebruik>* \- use (alternative name); the names listed here exist as terms in the thesaurus. <gebruik> and <gebruikt_voor> cannot occur together in the same record.
* *<gebruikt_voor>* \- used for (also known as); the names listed here exist as terms in the thesaurus and are alternative names. A term may or may not have other names (<gebruikt_voor>).


\! Our proposal is to create one URI for all <equivalente_term>, <gebruikt_voor> and <gebruik> terms, with a different label for each term, instead of creating a relation of type skos:exactMatch (CRM has no appropriate relation for equivalent terms). See the graph example below.


Example:

{code:xml}
<record>
<priref>43345</priref>
<ruimere_term>Zitomir (prov.)</ruimere_term>
<invoerder_datum>2007-11-14</invoerder_datum>
<wijziger_datum>2010-04-27</wijziger_datum>
<wijziger_datum>2007-11-14</wijziger_datum>
<domein option="PLAATS" value="PLAATS">
<text language="0">place</text>
<text language="1">plaats</text>
</domein>
<domein>plaats</domein>
<domein option="WOONPLAATS" value="WOONPLAATS">
<text language="0">residence</text>
<text language="1">woonplaats</text>
</domein>
<domein>woonplaats</domein>
<equivalente_term>Zitomir</equivalente_term>
<invoerder_naam>rs</invoerder_naam>
<wijziger_naam>pvp</wijziger_naam>
<wijziger_naam>rs</wijziger_naam>
<termbeschrijving>poolse benaming</termbeschrijving>
<term>Zytomierz</term>
<gebruikt_voor>This is fake value2</gebruikt_voor>
</record>
{code}
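
The proposal to give all equivalent and alternative names one URI with several labels amounts to grouping names that are linked by <equivalente_term>, <gebruikt_voor> or <gebruik>. A possible sketch using a small union-find structure is shown below; the record dictionaries are a simplified, hypothetical representation of the parsed XML, and "Amsterdam" is added only to show an unlinked term.

```python
def group_equivalents(records):
    """Group names that should share one URI; each group becomes one node."""
    parent = {}

    def find(name):
        # Register unseen names as their own group, then walk up to the
        # group representative, shortening the path on the way.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a, b):
        parent[find(a)] = find(b)

    for record in records:
        find(record["term"])
        for tag in ("equivalente_term", "gebruikt_voor", "gebruik"):
            for other in record.get(tag, []):
                union(record["term"], other)

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


records = [
    {"term": "Zytomierz", "equivalente_term": ["Zitomir"]},
    {"term": "Zitomir", "equivalente_term": ["Zytomierz"]},
    {"term": "Amsterdam"},
]
print(group_equivalents(records))
```

Each resulting group would get a single URI, and every name in the group would become one of its labels. Which name is the preferred label is a separate decision that this sketch does not make.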



Example: Graph with values showing ruimere term, equivalent term, gebruikt voor and gebruik.


!Geographical terms.png|border=1!


h2. Input formats

RKD provided samples in several formats and are awaiting our decision which format we want.

* *XML*: includes all possible info, but is more complicated. It takes RKD about 30 sec. to generate a thesaurus in this format.
* *CSV*: Vlado thought this would be best for us, since we already have a similar conversion of BM thesauri (bm-csv2ttl.pl). But it has these drawbacks:
** RKD cannot provide all languages in one CSV (WD 04-11-2011).
We could process in several passes (the first creates nodes, the others add labels), or the unix "[join|http://www.cims.nyu.edu/cgi-comment/info2html?(textutils)join%2520invocation]" command (part of GNU textutils) could help.
** If a column has several values, only one is present. But are there viable/useful multi-valued columns??
I've seen this for "narrower", but we'll use "broader" instead. I haven't seen multiple "broader" (i.e. multi-parents).
It may happen for "equivalent" and "related", but do we need them?
** It takes RKD about 5 min. to generate a thesaurus in this format, because they have to do this for each data language separately.
* *XLS*: manual compilation from CSV, more effort for RKD (it takes about 15 min), harder for us to read...
* *DAT*: simple line-oriented format. Each line starts with a 2-letter code identifying the field (e.g. te=term, bt=broader_term).
Big problem: it does not indicate the language (but I guess the languages come in a fixed order). E.g.:
{noformat}
bt photograph (print)
bt Foto (Print)
bt foto (afdruk)
te black and white photograph
te Schwarz / Weiss Foto (Print)
te zwart-witfoto (afdruk)
{noformat}
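
A minimal sketch of how the DAT sample above could be read, assuming the guess about a fixed language order is right. The order en, de, nl is inferred from the sample only, the function name {{parse_dat}} is hypothetical, and the sketch handles a single record because the sample does not show how records are separated.

```python
from collections import defaultdict

# Assumption: every field repeats its values in this fixed language order.
LANGUAGE_ORDER = ["en", "de", "nl"]


def parse_dat(lines, languages=LANGUAGE_ORDER):
    """Group the lines of one DAT record by field code, assigning languages by position."""
    seen_per_code = defaultdict(int)  # how many values we have seen for each field code
    record = defaultdict(dict)        # field code -> {language: value}
    for line in lines:
        if not line.strip():
            continue
        # The first token is the 2-letter field code, the rest is the value.
        code, _, value = line.strip().partition(" ")
        position = seen_per_code[code]
        seen_per_code[code] += 1
        if position >= len(languages):
            raise ValueError("more values than languages for field %r" % code)
        record[code][languages[position]] = value.strip()
    return dict(record)


sample = """bt photograph (print)
bt Foto (Print)
bt foto (afdruk)
te black and white photograph
te Schwarz / Weiss Foto (Print)
te zwart-witfoto (afdruk)"""

record = parse_dat(sample.splitlines())
print(record["te"]["nl"])
```

If a language is ever missing for a field, the positional assignment silently shifts the remaining labels, which is the main risk of this format.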


Vlado to RKD: our tech guys will confirm the selected input format and document it here.

h2. Conversion approach

Proposed approaches and estimates (which tool, how many p/d):
- Vlado: I made a simple Perl script (bm-csv2ttl.pl) to convert BM thesauri from Merlin CSV to RDF Turtle in SKOS as part of the RS demo (before we won it).
See [BM Thesauri] for description and attachment. Uses module Text::CSV::auto.
Took me 1-2d for 3 thesauri. It may be easy to adapt for this task.
- Mitac: Java (with opencsv.jar), 1-2d for the first 2 samples. It might be easier to produce N-Triples instead of Turtle.
- SSL: We suggest normalising each incoming dataset first, and then supporting a single conversion from the normalised format into RDF. To start off we'll convert the incoming data into the BM format and re-use Vlado's Perl script. That will give us working code and converted data. If we have time, it would be good to choose a recognised standard to normalise to instead of the BM format. We might need to maintain/modify the Perl a bit, but that should be straightforward. We'd like to put 4 days effort into this.
- Kalin?? (I guess it's overkill to ask 3 estimates)
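
As an illustration of the normalisation step, here is a minimal sketch of merging the per-language CSV exports into one record per concept (the "several passes" idea from Input formats). The column names {{id}}, {{term}} and {{broader}} and the function name {{merge_language_csvs}} are assumptions for the example, not the actual RKD export layout.

```python
import csv
import io


def merge_language_csvs(sources):
    """Merge per-language CSV files into one dict keyed by concept id.

    sources maps a language code to a file-like object holding a CSV with the
    (assumed) columns id, term and broader.
    """
    concepts = {}
    for lang, handle in sources.items():
        for row in csv.DictReader(handle):
            # The first file that mentions an id creates the node,
            # later files only add their label to it.
            concept = concepts.setdefault(row["id"], {"labels": {}, "broader": None})
            concept["labels"][lang] = row["term"]
            if row.get("broader"):
                concept["broader"] = row["broader"]
    return concepts


# Made-up sample rows, one CSV per data language.
en_csv = "id,term,broader\n1,photograph (print),\n2,black and white photograph,1\n"
nl_csv = "id,term,broader\n1,foto (afdruk),\n2,zwart-witfoto (afdruk),1\n"

merged = merge_language_csvs({"en": io.StringIO(en_csv), "nl": io.StringIO(nl_csv)})
print(merged["2"])
```

This only works if the exports carry a stable ID per concept, which is why the question about exporting IDs matters.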

h2. Output SKOS format

Vlado: I propose the following format ([https://svn.ontotext.com/svn/researchspace/data/thesauri.ttl])
{code}
@prefix rkd-material: <http://rkd.nl/thesaurus/material/> . # material/technique (of painting, frame, etc)
@prefix rkd-object: <http://rkd.nl/thesaurus/object/> . # object type
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . # needed for rdfs:label below

# object type
rkd-object:painting a crm:E55_Type, skos:Concept;
skos:inScheme rkd-object:; crm:P2_has_type rkd-object:;
rdfs:label "painting"@en, "schilderij"@nl.
rkd-object:frame a crm:E55_Type, skos:Concept;
skos:inScheme rkd-object:; crm:P2_has_type rkd-object:;
rdfs:label "frame"@en, "lijst"@nl.

# material / technique
rkd-material:wood--gold_plated a crm:E55_Type, skos:Concept;
skos:inScheme rkd-material:; crm:P2_has_type rkd-material:;
skos:broader rkd-material:wood;
rdfs:label "wood, and gold plated"@en, "hout, gestoken en verguld"@nl.
rkd-material:oil_paint a crm:E55_Type, skos:Concept;
skos:inScheme rkd-material:; crm:P2_has_type rkd-material:;
rdfs:label "oil paint"@en, "olieverf"@nl.
{code}

*Important Notes*
- "rkd-object:;" is valid Turtle syntax (empty local name) and is tested in OWLIM. Could also be written as "rkd:object;", doesn't matter
- The thesauri were made up during the mapping of Susanna (sample painting) and need to be adjusted to real data.
Eg the above puts frame material (wood--gold_plated) and painting technique (oil_paint) in one skos:ConceptScheme, but in the RKD data, SUPPORT is a separate thesaurus
- Maria: devise a URI scheme and propose it to RKD for clearance (it makes sense to use [http://rkd.nl] as prefix, since it's their data)
- Mariana: load SKOS ontology with appropriate reasoning, so when we assert skos:broader, it infers broaderTransitive
(and maybe narrower and narrowerTransitive: will be useful for the UI to build a thesaurus tree)
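
To make the expected inference concrete, here is a small sketch of what asserting skos:broader should yield for broaderTransitive and narrowerTransitive. In the repository this is done by the reasoner, not by application code; the function names are hypothetical and the {{material}} top concept is made up for the example.

```python
def broader_transitive(broader):
    """Return concept -> set of all ancestors, given concept -> set of direct broader concepts."""
    closure = {}
    for concept in broader:
        seen = set()
        stack = list(broader.get(concept, ()))
        while stack:
            parent = stack.pop()
            if parent in seen:
                continue  # already visited; also guards against cycles
            seen.add(parent)
            stack.extend(broader.get(parent, ()))
        closure[concept] = seen
    return closure


def narrower_transitive(closure):
    """Invert the broaderTransitive closure: concept -> set of all descendants."""
    inverse = {}
    for concept, ancestors in closure.items():
        for ancestor in ancestors:
            inverse.setdefault(ancestor, set()).add(concept)
    return inverse


# Direct skos:broader assertions; "material" is a made-up top concept.
broader = {
    "wood--gold_plated": {"wood"},
    "wood": {"material"},
    "material": set(),
}

closure = broader_transitive(broader)
descendants = narrower_transitive(closure)
print(sorted(closure["wood--gold_plated"]))
print(sorted(descendants["material"]))
```

The descendants map is what a UI would need to expand a whole branch of the thesaurus tree from one node.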

h2. URI Scheme - Proposal

For each thesaurus value we propose generating a URI of the following form:

http://rkd.nl/thesaurus/_thesaurus_name_/_thesaurus_value_, where:
* [http://rkd.nl] is the prefix
* thesaurus \- indicates that this is a thesaurus URI
* _thesaurus_name_ \- is the name of the thesaurus
* _thesaurus_value_ \- is the concrete value from the thesaurus. The URI is formed from the EN or NL label:
** Space " " is replaced with underscore "_"
** A bracketed part " (...)" is introduced by two hyphens "--" and the closing bracket is dropped (hopefully reflecting hierarchical values)

Example: URI of value "panel (birch wood)" in "support" thesaurus:
- [http://rkd.nl/thesaurus/support/panel--birch_wood]
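
The rules above can be sketched as a small function. The name {{thesaurus_uri}} is hypothetical, and the handling of anything other than spaces and brackets (commas, slashes, accents) is not covered by the proposal and is left untouched here.

```python
import re

BASE = "http://rkd.nl/thesaurus/"


def thesaurus_uri(thesaurus_name, label):
    """Build a thesaurus URI from a thesaurus name and an EN or NL label."""
    value = label.strip()
    # An opening bracket (and the spaces around it) becomes "--".
    value = re.sub(r"\s*\(\s*", "--", value)
    # The closing bracket is dropped.
    value = value.replace(")", "").strip()
    # Remaining runs of whitespace become a single underscore.
    value = re.sub(r"\s+", "_", value)
    return BASE + thesaurus_name + "/" + value


print(thesaurus_uri("support", "panel (birch wood)"))
```

Labels containing characters that are not valid in a URI would still need percent-encoding or a further replacement rule.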














h1. Questions


# Which thesauri do you use?
#- We assume RKD locations, people, concepts, artworks, IconClass, but would like you to confirm explicitly
#- (?) Are these the same well-known RKD thesauri shown in red on the attached picture? Can you tell from the numbers? E.g. People (=RKDartists) having 331,455 {color:#ff0000}{*}Please see my comments above under "Thesauri description". WD, 04-11-2011{*}{color}
!eCulture data cloud.png|width=400!
# (?) Can you *urgently send us the thesauri* used by the Rembrandt project (within 1 week)? Don't wait to translate them all. Multilingual is not critical for us: even if it's only in Dutch, we'll implement some language-fallback in the UI. Send us the large thesauri like "People", "Locations" etc. Preferably in SKOS if you have them in that format. 2 sample thesauri have been received. {color:#ff0000}{*}We will send others. WD, 04-11-2011{*}{color}
# (?) Is it possible to make an export with the ID's (codes/URIs) of the controlled fields, in addition to the text label? Yes, as you have seen in the sample thesauri. {color:#ff0000}{*}I hope that is what you meant. WD, 04-11-2011{*}{color}
# If not: can you list for each controlled field:
the thesaurus it came from, which branch (parent), whether it’s immediate child or descendant of any depth? We need this info so we can lookup and store the ID.
# Is it possible for you to send us the thesauri in different languages together in one CSV?
# Please confirm that you agree with the URI format proposed by Vladimir (see the section above).

h3. Thesaurus Properties

There is a properties file to map data fields contents to their thesaurus type and language counterparts

[https://svn.ontotext.com/svn/researchspace/trunk/entity-api/src/resources/thes.properties]







h3. Thesaurus parsing script

The thesaurus parsing Python script is no longer an attachment on this page; it is now in svn at

[https://svn.ontotext.com/svn/researchspace/trunk/parseThes/src/parse_thes.py]



{code}
Usage: Usage parse_thes [options] >outfile. 2>error.out

Options:
-h, --help show this help message and exit
-f INPUTFILE, --inputfile=INPUTFILE
filename for thes input data, MANDATORY
-i IGNORELIST, --ignorelist=IGNORELIST
comma separated list of types to ignore, OPTIONAL
-g, --geographic sets the mode to geographic, OPTIONAL
-l LANGUAGE, --language=LANGUAGE
language string such as nl sets a default language for
labels with out a language, OPTIONAL
-v, --verbose generates lots of output to stderr, OPTIONAL
-d DELETELIST, --deletelist=DELETELIST
comma separated list of values to ignore, OPTIONAL
{code}
An example of use:
{code}
./parse_thes.py -g -f ../thes_input_data/Vocabulary\ GEOGRAPHICAL\ THESAURUS.xml -i woonplaats -l nl -d wereld >Vocabulary\ GEOGRAPHICAL\ THESAURUS.ttl
{code}

h4. Validating the output

The validator linked below is useful, but it has a size limit: thesauri-place.ttl has to be chopped up to validate.

[http://www.rdfabout.com/demo/validator/]
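
One way to chop a large Turtle file into pieces that each validate on their own is to repeat the prefix declarations in every piece. The sketch below is not part of parse_thes.py and the function name {{split_turtle}} is hypothetical. It assumes every statement ends on a line whose last character is "." and that prefix declarations start with {{@prefix}}; a trailing comment after the final "." or a multi-line literal would break it.

```python
def split_turtle(lines, statements_per_chunk):
    """Split Turtle lines into chunks, repeating the @prefix lines in each chunk."""
    prefixes = []
    statements = []
    current = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("@prefix"):
            prefixes.append(line)
            continue
        if not stripped and not current:
            continue  # skip blank lines between statements
        current.append(line)
        if stripped.endswith("."):
            statements.append(current)
            current = []
    if current:
        statements.append(current)  # keep an unterminated tail rather than lose it

    chunks = []
    for start in range(0, len(statements), statements_per_chunk):
        body = [l for st in statements[start:start + statements_per_chunk] for l in st]
        chunks.append(prefixes + body)
    return chunks


sample = """@prefix ex: <http://example.org/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

ex:a a skos:Concept;
  skos:prefLabel "a"@en.
ex:b a skos:Concept;
  skos:prefLabel "b"@en.
ex:c a skos:Concept."""

chunks = split_turtle(sample.splitlines(), statements_per_chunk=2)
print(len(chunks))
```

Each chunk can then be written to its own file and pasted into the validator separately.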


h3. Thesaurus Turtle files

The thesaurus Turtle files (generated by the script above) are in svn:

[https://svn.ontotext.com/svn/researchspace/trunk/data/thesauri-all.ttl]
[https://svn.ontotext.com/svn/researchspace/trunk/data/thesauri-place.ttl]





These incorporate the resolution of issues RS-170, RS-167, RS-95 and RS-171