comes from AdLib collection management software and is better structured
- see [^Rembrandt Database 2-23-09.pdf] for a description of this project
- We got 2 sample XMLs (a record list and an individual record about a painting). We still don't have the schema
- tags are in Dutch, but English comments often follow them; a dictionary will help with understanding
- many values are bilingual
20110917 Vladimir to Bert Degenhart Drenth firstname.lastname@example.org cc BM, SSL, Mariana
We're starting work on the www.ResearchSpace.org project for the British Museum.
One of the tasks is to convert the Rembrandt project database to CIDOC CRM and import it to ResearchSpace.
Dominic Oldman (IS Development Manager of BM) sent us some sample records from Rembrandt (attached), and I noticed they are exported from your system: the root is <adlibXML>.
So I have some questions:
- are you involved in the Rembrandt project, or do they only use your system?
- could you send us the schema of the Rembrandt XML export?
- would it be possible to export using English tags? (We don't speak Dutch). There are English XML comments in the file (very useful) but the conversion will be more robust if the tags are in English.
- does AdLib have conversion/export to some standard format (e.g. LIDO) that can be used by Rembrandt? It would be higher value to BM if we develop an import from a standard format.
Thanks in advance for your cooperation and best regards!
I've simplified the XML sample record to a table for easier comprehension, by using (all hail!) Emacs and these commands:
|M-%||query-replace||</.*?>||remove closing tags|
|C-u C-x C-o||my-delete-blank-lines||remove empty lines|
|M-%||query-replace||><||> <||add a space here: <empty-tag/><!-- English comment -->|
|M-=||query-replace-regexp||^(TAB*)||\,(format "%dTAB%s" (length \1) (make-string (length \1) ?-))||replace leading tabs with N (tag level) and leading dashes (to conserve space while still showing the hierarchy)|
|M-=||query-replace-regexp||>(.*)(<!-- (.*) -->)||> \3TAB\1||move English comment right after the tag, put element content in new column|
|M-=||query-replace-regexp||>([^ ])||>TAB\1||put remaining element contents in new column|
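For anyone without Emacs at hand, the same steps can be approximated in a short script. This is only a sketch under assumptions: the function name `simplify_adlib_xml` is made up, the export is assumed to be indented with tabs and to hold one element per line, and the tag names in the usage note are hypothetical, not taken from the real export. It also trims trailing whitespace, which the Emacs commands above do not do.

```python
import re


def simplify_adlib_xml(xml_text: str) -> str:
    """Flatten a tab-indented XML export into tab-separated rows.

    Each row is: level, dashes + opening tag (+ English comment), element content.
    Mirrors the Emacs steps in the table above; assumes one element per line.
    """
    rows = []
    for line in xml_text.splitlines():
        # 1. Remove closing tags such as </titel>
        line = re.sub(r"</[^>]*>", "", line)
        # 2. Drop lines that are now empty (they held only a closing tag)
        if not line.strip():
            continue
        # 3. Add a space between an empty tag and the comment that follows it
        line = line.replace("><", "> <")
        # 4. Replace leading tabs with the level number and as many dashes
        body = line.lstrip("\t")
        level = len(line) - len(body)
        line = f"{level}\t{'-' * level}{body}"
        # 5. Move the English comment right after the tag, content to a new column
        line = re.sub(r">(.*)<!-- (.*) -->", r"> \2\t\1", line)
        # 6. Put any remaining element content in a new column
        line = re.sub(r">([^ ])", r">\t\1", line)
        # Extra step (not in the Emacs recipe): trim trailing whitespace
        rows.append(line.rstrip())
    return "\n".join(rows)
```

Usage, with made-up tags: the line `<TAB><TAB><titel>De Nachtwacht</titel><!-- title -->` becomes the row `2<TAB>--<titel> title<TAB>De Nachtwacht`, and lines holding only a closing tag disappear.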
I've reduced the sample by leaving only 1 instance of each element. In many cases I merged elements so that the remaining one has all possible sub-elements. This may lead to nonsensical data, e.g. a begindatum_in_collectie that is later than the einddatum_in_collectie.
This can be used as the frame on which to discuss and later describe the CRM mapping.
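The reduction described above (one instance of each element, merged so that it carries every sub-element seen) was done by hand, but it could be sketched roughly as follows. This is an illustration under assumptions, not the procedure actually used: the function names are made up, and apart from begindatum_in_collectie and einddatum_in_collectie the tag names in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def merge_into(target: ET.Element, source: ET.Element) -> None:
    """Fold source into target: keep target's text if it has any, adopt all of source's children."""
    if not (target.text or "").strip() and (source.text or "").strip():
        target.text = source.text
    for child in source:
        target.append(child)


def reduce_tree(elem: ET.Element) -> ET.Element:
    """Keep one child per tag under elem (in place), merged from all same-tag siblings."""
    first_by_tag = {}
    for child in list(elem):  # iterate over a copy, since children are removed below
        keeper = first_by_tag.get(child.tag)
        if keeper is None:
            first_by_tag[child.tag] = child
        else:
            merge_into(keeper, child)
            elem.remove(child)
    for child in elem:
        reduce_tree(child)
    return elem
```

Usage: given two `record` elements, one with `<periode><begindatum_in_collectie>` and the other with `<periode><einddatum_in_collectie>`, `reduce_tree` leaves a single `record` whose `periode` holds both dates. Since the two values come from different records, this is exactly how a start date later than the end date can appear.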