"Continuous Access To Cultural Heritage" Programme, ran by The Netherlands Organization for Scientific Research (NWO). In CATCH, computer scientists, the humanities and the cultural heritage sector have set up a unique collaboration with collection managing institutions. Innovation, transferability and cooperation are central issues in this project.
CATCH started in 2004 and a number of subprojects have finished.
CATCHPlus builds on CATCH: the prototypes and demos from eight of the finished projects will be converted into reliable tools, that is, solid software that can be used in multiple institutions. From prototype to reliable application!
- Shared Infrastructure
- Shared Services http://www.catchplus.nl/en/projecten/gemeenschappelijke-diensten/
- OpenSKOS: An online platform for thesauri
- Workspaces: Collaborating in a digital working environment
- Annotation Repository: Sharing and reusing standardised annotations online
- Persistent Identifiers: Confidence in unique and persistent identifiers
- Specific sub-projects http://www.catchplus.nl/en/projecten/deelprojecten/
- SCRATCH4All: Digital Monk deciphers digitised manuscripts
- CHoralPlus: Searching in speech
- WITCHCRAFTplus: Online searching in melodies
- MuSeUMPlus: Improved searching with MuS and Gemeentemuseum. https://github.com/gemeentemuseum/catch
Aims to simplify setting up a search engine for collections, with the most common search features already included and configured. Oriented towards AdLib.
- UPR and ZieOok: Recommendations based on a personal profile
- Multiply: Keyword suggestions based on automatic text analysis
- STITCHPlus: Meaningful searching in existing collections
Actually, it's more than that; see below.
- DocChecker: Automatic keyword suggestion
Semantic Interoperability To access Cultural Heritage.
Includes demos http://stitch.cs.vu.nl/demo.html
- BnF Mandragore - Illuminated Manuscripts
- Rijksmuseum ARIA - Illuminated manuscripts
- CATCH Vocabulary and Alignment Repository
Providing access to vocabularies and alignments by means of a standardized (SKOS-oriented) web service.
- Rameau subject headings as linked data
Providing access to the main subject vocabulary at the French National Library using SKOS and linked data recipes.
- Event extraction and named entity recognition
Effective use of named entity extractors and related techniques to identify events in text using a controlled vocabulary. The focus is on enriching the metadata to improve accessibility for the end user.
- Crowdsourcing tool: semantic tagger
Aimed at the end user: it lets them link records, websites, or anything else that can be identified by a URL / URI to a concept from an open data source. The product is a web application that can be loaded into any website as a plugin with a few lines of code. The idea is that the tagger is invoked with a unique ID (preferably a URL) of the website where the tagger is used.
- Matching concepts with DBPedia
Links concepts in the KB thesaurus to the corresponding entries in DBPedia. Local indexing with Solr / Lucene.
These links can be used to give the user more in-depth information on a subject, and eventually to expand their query into other linked libraries.
- Matching personal names with VIAF Authority File
Used to match personal names from the KB thesaurus to the corresponding persons in the VIAF authority file via opensearch endpoint.
GPL license, written in Ruby. Uses a simple scoring scheme based on name and life dates (date of birth, date of death).
Documentation is in Dutch (NL); Google translation.
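The kind of simple scoring scheme described above can be sketched as follows. This is a minimal illustration in Python, not the project's Ruby code: the `Person` class, the `normalise`, `score` and `best_match` helpers, the weights (2 points for a name match, 1 point per matching life date) and the threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical record shape: a name plus optional life dates (years only).
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def normalise(name: str) -> str:
    # Lowercase, drop commas and full stops, and sort the name parts so that
    # "Rijn, Rembrandt van" and "Rembrandt van Rijn" compare equal.
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(sorted(cleaned.split()))


def score(local: Person, candidate: Person) -> int:
    # Assumed weights: 2 points for a matching name,
    # 1 point for each matching life date.
    points = 0
    if normalise(local.name) == normalise(candidate.name):
        points += 2
    if local.birth_year is not None and local.birth_year == candidate.birth_year:
        points += 1
    if local.death_year is not None and local.death_year == candidate.death_year:
        points += 1
    return points


def best_match(local: Person, candidates: List[Person], threshold: int = 3) -> Optional[Person]:
    # Return the highest-scoring candidate, or None if nothing reaches the threshold.
    if not candidates:
        return None
    top = max(candidates, key=lambda c: score(local, c))
    return top if score(local, top) >= threshold else None
```

With the assumed threshold of 3, a candidate needs a name match plus at least one matching life date, so a name match alone is not accepted as a link.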
- Access enriched metadata descriptions
A web application that takes a SKOS PPN number, or DBPedia or VIAF URI, and returns extra info in JSON, RDF, or HTML format
- Harvesting mechanism
Two preconfigured harvesters: one in Java (for stable harvests) and one in Ruby (for rapid testing).
- Pipeline modules for periodic updates
Updates the thesauri used by the STITCH+ project, in RDF/SKOS format, on a periodic basis, via OAI update from OCLC / Pica. This is now part of the KB's production IT infrastructure. The data is available at http://data.kbresearch.nl.
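A periodic update of this kind could be driven by an incremental harvest request. The sketch below assumes the "OAI update" is OAI-PMH and only builds the request URL: `verb`, `metadataPrefix`, `from` and `set` are standard OAI-PMH arguments, while the base URL, the function name `list_records_url` and the set name in the usage note are hypothetical, since the real endpoint is not given in these notes.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Hypothetical endpoint; the real OAI base URL is not given in these notes.
BASE_URL = "https://example.org/oai"


def list_records_url(since: date,
                     metadata_prefix: str = "oai_dc",
                     oai_set: Optional[str] = None) -> str:
    # Build an OAI-PMH ListRecords request that only asks for records
    # changed on or after `since` (selective harvesting via `from`).
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": since.isoformat(),
    }
    if oai_set is not None:
        # Optional restriction to a single set, e.g. one thesaurus.
        params["set"] = oai_set
    return f"{BASE_URL}?{urlencode(params)}"
```

For a daily run, `list_records_url(date(2012, 1, 31), oai_set="GTT")` would request everything in the (assumed) `GTT` set changed since that date; the harvested records would then be converted to RDF/SKOS by the rest of the pipeline.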
- Local Triple Store
The KB thesaurus is loaded in a local triple store. Currently, the thesaurus has 1.2M subjects from different catalogs (e.g. NBD Biblion, GTT, personal names), making 7.3M triples. Daily updates are harvested and incorporated into this triple store.
- Five vocabularies converted to SKOS
Vocabularies converted to SKOS, including the GTT, Brinkman and Biblion authority data from GGC. The methods used are covered by the STITCH project; no new conversions were made.
See [Terminology Matching Tools#STITCH@CATCH] for details.
- Enrichment of thesauri
- Harvesting mechanism