[AnnoCultor|http://annocultor.eu/] started out as "Just Code it in Java", Mitac's favored approach :-) It is a Java application used by VU for the eCulture Pilot. [Papers|http://annocultor.eu/pubs.html] (local copies):
- [Data Migration and Ingestion Tools^Porting Cultural Repositories to the Semantic Web (2008).pdf]: 12p. Shows frequency of use and distribution of conversion Rules (methods) by group and dataset (p.7). Has some estimates on effort (p.10 sec 4.4).
- [Data Migration and Ingestion Tools^Semantic Excavation of the City of Books (2007).pdf]: 8p
- [Data Migration and Ingestion Tools^Thesaurus and Metadata Alignment for a Semantic E-Culture Application (2007).pdf]: 2p
AnnoCultor has grown into a fully-fledged conversion framework that converts databases and XML files to RDF and semantically tags them with links to vocabularies, so they can be published as Linked Data on the Semantic Web. It consists of:
- AnnoCultor Converter: converts SQL databases, XML files, and SPARQL datasets to RDF. Converters are written in XML in a simple declarative way, and common XML editing skills are sufficient to write one.
- AnnoCultor Tagger: allows assigning semantic tags (terms from existing vocabularies) to your data. Recently used to semantically tag nearly 7 million records from the Europeana collections with location data.
- AnnoCultor Time Ontology: a vocabulary of time periods: millennia, centuries, half centuries, quarters, decades, and years. It also includes historical periods, like the Middle Ages.
(Last time I looked, there was no viable time periods ontology, maybe that's changed)
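To illustrate the kind of period hierarchy listed above, here is a minimal Python sketch that computes the periods containing a given year. It is only an assumption about how such periods could be derived; AnnoCultor's own ontology may draw the boundaries differently, and the function and key names are made up.

```python
def periods(year: int) -> dict:
    """Return the calendar periods that contain a given year (AD only).

    Assumed convention: centuries run 1601-1700, halves and quarters are
    counted within the century, decades are labelled by their first year.
    """
    if year < 1:
        raise ValueError("this sketch only handles years >= 1")
    offset = (year - 1) % 100          # 0-based position within the century
    return {
        "millennium": (year - 1) // 1000 + 1,
        "century": (year - 1) // 100 + 1,
        "half_century": offset // 50 + 1,   # 1 or 2
        "quarter": offset // 25 + 1,        # 1 to 4
        "decade": year // 10 * 10,          # e.g. 1640 for the 1640s
        "year": year,
    }


p = periods(1642)
print(p["century"], p["quarter"], p["decade"])
```

Note that the decade label follows the everyday "1640s" convention, so a year like 1700 falls in the 17th century but in the decade labelled 1700.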
- It's a bit unclear which is the latest release.
- [http://annocultor.svn.sourceforge.net/viewvc/annocultor/trunk/src/main/java/eu/annocultor/converters/time/OntologyToHtmlGenerator.java] was updated most recently (7 weeks ago)
- Most of the other files are a year old
It implements the following conversion Rules:
- Create constant
- Rename resource property
- Rename literal property
- Replace value
- Lookup person
- Lookup place
- Lookup term
- Facet rename property
- Use value of other path
- Use other subject
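To make the rule names above concrete, here is a minimal Python sketch of four of them (create constant, rename property, replace value, lookup term). This is not AnnoCultor's API: AnnoCultor is a Java application with converters written in XML, and every function name, property name and vocabulary URI below is made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]                 # (subject, property, value)
Rule = Callable[[str, str], List[Triple]]     # (subject, source value) -> triples


def create_constant(target: str, constant: str) -> Rule:
    """Emit a fixed value, ignoring the source value."""
    return lambda subj, value: [(subj, target, constant)]


def rename_property(target: str) -> Rule:
    """Keep the value, emit it under a different property."""
    return lambda subj, value: [(subj, target, value)]


def replace_value(target: str, mapping: Dict[str, str]) -> Rule:
    """Map known source values; pass unknown ones through unchanged."""
    return lambda subj, value: [(subj, target, mapping.get(value, value))]


def lookup_term(target: str, vocabulary: Dict[str, str]) -> Rule:
    """Emit the vocabulary URI for a known label, else keep the literal."""
    def rule(subj: str, value: str) -> List[Triple]:
        return [(subj, target, vocabulary.get(value.strip().lower(), value))]
    return rule


def convert(subject: str, record: Dict[str, str],
            rules: Dict[str, List[Rule]]) -> List[Triple]:
    """Apply every rule registered for each source field of the record."""
    triples: List[Triple] = []
    for field, value in record.items():
        for rule in rules.get(field, []):
            triples.extend(rule(subject, value))
    return triples


# Hypothetical source fields and target properties.
RULES = {
    "titel": [rename_property("dc:title")],
    "type": [replace_value("dc:type", {"schilderij": "painting"}),
             create_constant("rdf:type", "ex:Work")],
    "materiaal": [lookup_term("dc:medium",
                              {"canvas": "http://example.org/vocab/canvas"})],
}

record = {"titel": "De Nachtwacht", "type": "schilderij", "materiaal": "Canvas"}
for triple in convert("ex:obj1", record, RULES):
    print(triple)
```

Keeping each rule as a small function keyed by source field mirrors the "one rule per source path" idea; rules like "Use other subject" would need access to the whole record and are left out of this sketch.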
h1. CRM Mapping
An approach by Doerr and company for mapping anything to CRM.
This is all about slicing, dicing and combining source paths to target paths.
Got from CRM site: [crm_mappings|http://www.cidoc-crm.org/crm_mappings.html], [tools|http://www.cidoc-crm.org/tools.html].
- [Data Migration and Ingestion Tools^Mapping format for data structures to CRM (2001).pdf]: older paper
- [Data Migration and Ingestion Tools^Mapping a Data Structure to the CIDOC Conceptual Reference Model (2002).pdf]: presentation
- [Data Migration and Ingestion Tools^Mapping Language for Information Integration (FORTH TR385 Dec2006).pdf]: paper describing the process
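As a rough illustration of mapping a source path to a CRM target path, here is a Python sketch that expands one short source path into a chain of CRM entities and properties. It is an assumption-laden toy, not the mapping format from the papers above: the source path, the blank-node naming and the use of rdfs:label for the final value are all made up, and only the CRM class and property names are taken from the CRM itself.

```python
from itertools import count
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping: one source path -> alternating CRM classes and properties.
MAPPING = {
    "object/creator/name": [
        "E22_Man-Made_Object", "P108i_was_produced_by",
        "E12_Production", "P14_carried_out_by",
        "E39_Actor", "P131_is_identified_by",
        "E82_Actor_Appellation",
    ],
}


def expand(source_path: str, root: str, value: str,
           ids: Optional[Iterator[int]] = None) -> List[Triple]:
    """Turn one source value into the chain of triples for its target path."""
    path = MAPPING[source_path]
    classes, props = path[0::2], path[1::2]
    ids = ids if ids is not None else count(1)
    triples: List[Triple] = [(root, "rdf:type", classes[0])]
    node = root
    for prop, cls in zip(props, classes[1:]):
        nxt = f"_:n{next(ids)}"            # intermediate node for this step
        triples.append((node, prop, nxt))
        triples.append((nxt, "rdf:type", cls))
        node = nxt
    triples.append((node, "rdfs:label", value))  # simplification: value as label
    return triples


for t in expand("object/creator/name", "ex:obj1", "Rembrandt"):
    print(t)
```

The point of the sketch is that a single flat source field typically fans out into several intermediate CRM nodes (here a Production and an Actor), which is what makes the slicing and combining of paths non-trivial.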
Proposes a conceptual structure for describing mappings, in the form of XML schema:
Tool versions (please note that older releases may have stuff that is not in later releases\!)
- Saved to [\\ontonas\all-onto\Projects\culture\mapping\CRM-mapping]
- [CidocXML2RDFv6|http://www.cidoc-crm.org/downloads/CidocXML2RDFv6.rar] (Sep 2011): includes CRMdig ontology for Digitalization (3D COFORM)
- [MappingTool (v 1.1)|http://www.cidoc-crm.org/downloads/MappingTool(XML2RDF-DataTransformation)(v%201.1).zip] (Oct 2010): includes a GUI tool for mapping (!), generates XML of the mapping (?), implements the mapping process (?)
- [mapping_tool_4_12_02|http://www.cidoc-crm.org/docs/mapping_tool_4_12_02.zip] (Apr 2002)
Talend is an open source ETL framework.
- Used extensively by Onto's LifeSci group. They also develop custom components for RDF output, semantic annotation, etc.
- Proposed for use by FP7 SME [Bids:Linked City]
- Used by UC Berkeley for a [CollectionSpace] deployment
Vaso and Deyan Peychev swear by it. Includes:
- A GUI for creating, and a framework for executing, complex ETL flows, with exception catching, document routing, etc.
- A GUI for creating mappings, e.g. from XML to RDF.
I'll ask Deyan whether the Talend mapper can implement CRM path slicing & dicing.
- [Talend:] space (now open to SSL and SirmaITT)
- [LIFESKIM:Talend] intro, [Tutorial|Talend:Tutorial - Talend Semantic ETL 0.2] (these may be already merged into the above space)
[MINT|http://mint.image.ece.ntua.gr/redmine/projects/mint/wiki] is a data conversion toolkit used by numerous projects (Athena, Judaica etc) to contribute to Europeana.
Nice graphical mapper, nice demo movie etc
- The source of Europeana conversion tools was donated
- A group of Europeana core developers founded Delving ([http://www.delving.eu]) to continue developing the source and offer professional services
- Open source on GitHub: [https://github.com/delving/delving]
- "currently being adopted by a wide variety of Cultural Heritage Institutions across Europe. The development and the current feature set would not have been possible without [the support and contributions by these organisations|http://www.delving.eu/the-delving-platform/origins-and-support]"
The platform includes:
- Aggregator toolkit, including SIP Creator
SIP means "Submission Information Package", which is a name for the metadata sets that Europeana ingests through OAI-PMH
-- Graphical Metadata Mapping and storage Tool (may be useful for Rembrandt & Cranach data conversion)
Can take any XML as its source. Generates the "obvious" mappings. Scriptable using Groovy and a simple domain-specific language for splitting and joining fields, etc. [Excellent movie|http://www.delving.eu/the-delving-platform/aggregator-toolkit]
-- Metadata Repository accessed via OAI-PMH (Open Archives Initiative - Protocol for Metadata Harvesting)
-- Integrated Solr/Lucene Search Engine, Open-Search API
-- Persistent Identifier Management
-- Dynamic Thumbnail Caching
-- Source XML Data Analyser
-- Data Set Upload and Remote Management
- Web Portal
Make digital objects discoverable by the rest of the world through a web portal with powerful search features. The portal has an integrated multilingual Content Management System and interface support for 28 European languages and role-based User Management.
-- Web Portal
--- Simple & Advanced search
--- Summary Views as Grid or List
--- Facet-based result drill-down
--- Related Items
--- Detailed Metadata Result View
--- Interface Support for 28 European Languages
-- User Management
--- Create and manage users
--- Custom views based on user-roles
-- Integrated CMS
--- Create pages and custom content
--- Upload images
--- Create custom menus
--- Simple versioning system
-- Annotation Component for Objects, Images, Movies and Maps (donated by Austrian Institute of Technology, full integration planned in 2011)
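Since the Metadata Repository in the list above is accessed via OAI-PMH, here is a small Python sketch of how a harvesting request URL is put together. The verb and parameter names (ListRecords, metadataPrefix, set, resumptionToken) come from the OAI-PMH protocol; the endpoint URL and set name are placeholders, not a real Delving endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None,
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL.

    A resumptionToken is an exclusive argument: when continuing a
    partial list, it is sent alone with the verb.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name.
print(list_records_url("http://example.org/oai", set_spec="museum"))
print(list_records_url("http://example.org/oai", resumption_token="abc123"))
```

A harvester would fetch the first URL, parse the returned XML, and keep requesting with the resumption token it finds until none is returned.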
CultureCloud is a sort of transnational Europeana aggregator that wants to do a lot of things that would be important for us.
E.g. terminology mapping, cross-linking of objects, crowd-sourcing (many people being able to edit), Europeana publishing...
- They don't yet seem to have a website.
- Applied under FP7 PSP (call closed Jul 31); we'll see if it gets funding.
- Found out about it from the CIDOC conference that's just ended: [http://www.brukenthalmuseum.eu/cidoc/uk/file/abstracts.pdf] p.11
Karma is a Data Integration Tool by USC. It enables users to quickly and easily integrate data from a variety of sources (databases, spreadsheets, delimited text files, XML, JSON, KML and Web APIs) and convert it to RDF.
- [http://www.isi.edu/integration/karma/]: the Karma website is very informative, including papers and videos.
It describes applications to biosciences, cultural heritage (Smithsonian), geo mashups, web APIs (eg Twitter).
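For a feel of what "delimited text to RDF" means at its simplest, here is a Python sketch that turns CSV rows into N-Triples lines. It has nothing to do with Karma's actual implementation or interface (Karma is an interactive tool that builds a semantic model); the function name, base URI and sample data are made up, and only the Dublin Core property URIs are real.

```python
import csv
import io
from typing import Dict, List

DC = "http://purl.org/dc/elements/1.1/"


def csv_to_ntriples(text: str, base: str, key: str,
                    predicates: Dict[str, str]) -> List[str]:
    """Emit one N-Triples line per non-empty mapped cell.

    `key` names the column used to mint subject URIs; `predicates`
    maps column names to property URIs.
    """
    lines: List[str] = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{base}{row[key]}>"
        for column, predicate in predicates.items():
            value = row.get(column) or ""
            if value:
                # Escape backslashes and quotes for an N-Triples literal.
                escaped = value.replace("\\", "\\\\").replace('"', '\\"')
                lines.append(f'{subject} <{predicate}> "{escaped}" .')
    return lines


# Made-up sample data.
DATA = "id,title,creator\n1,Night Watch,Rembrandt\n2,Untitled,\n"

for line in csv_to_ntriples(DATA, "http://example.org/object/", "id",
                            {"title": DC + "title", "creator": DC + "creator"}):
    print(line)
```

Empty cells are skipped rather than emitted as empty literals, which is one of the small decisions a real tool has to make per column.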