
Intro

  • as part of the RS 20120502 Mellon Demo, they wanted to see a query working across different servers
  • we'll be able to provide only tabular data as a demo; is this sufficient?
  • (too bad Yale has only now deployed OWLIM: they wanted a very similar thing, and maybe the idea comes from them)
  • BarryN helped with the SPARQL Federation syntax
  • the queries will be executed on the RS endpoint, since SPARQL Federation (SERVICE) does not allow specifying a user/password, and the RS endpoint requires one

Potential Endpoints

ResearchSpace
  interactive query: http://researchspace.ontotext.com/openrdf-workbench/repositories/susana/query
  SERVICE: http://researchspace.ontotext.com/openrdf-sesame/repositories/susana
Yale
  interactive query: http://collection.britishart.yale.edu/openrdf-workbench/repositories/owlim/query
  SERVICE: http://collection.britishart.yale.edu/openrdf-sesame/repositories/owlim
  notes: Has no artist names, ulan:nnn don't really correspond to ULAN IDs, doesn't seem to have
BritishMuseum
  interactive query: http://collection.britishmuseum.org/Sparql
  SERVICE: unknown
  notes: "Syntax=SparqlResults/XML&Query=..." gives the error "Invalid parameters". Without Syntax= it gives "An error occured with retrieving the results of the query"
DBpedia/FactForge
  interactive query: http://factforge.net/sparql
  SERVICE: http://factforge.net/sparql
  notes: distinguishes the result format by Accept header, eg
    curl -H "Accept: application/sparql-results+json" \
      "http://factforge.net/sparql?query=SELECT+*+{%3Fs+%3Fp+%3Fo}+LIMIT+10"
  • the current plan is to use RS and DBpedia.
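
The curl call for FactForge can also be prepared from a script. This is a minimal Python sketch, assuming the endpoint accepts the standard SPARQL protocol `query` parameter over GET. The function name `build_sparql_request` is ours, not part of any library, and nothing is sent until the request is passed to `urlopen`.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query,
                         accept="application/sparql-results+json"):
    """Build (but do not send) a SPARQL protocol GET request.

    The result format is selected through the Accept header,
    which is how FactForge distinguishes formats.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


req = build_sparql_request("http://factforge.net/sparql",
                           "SELECT * {?s ?p ?o} LIMIT 10")
# To actually run the query: urllib.request.urlopen(req).read()
```

`urlencode` percent-encodes the braces and question marks, so the query text can be written in its natural form.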

Test Queries

Vital data from DBpedia

(on RS) Rembrandt's vital data, obtained from DBpedia

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbp-prop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * WHERE {
 SERVICE <http://factforge.net/sparql> {
    ?act a foaf:Person.
    ?act rdfs:label "Rembrandt"@en.
    OPTIONAL {?act dbp-prop:birthdate  ?birthDate}
    OPTIONAL {?act dbp-prop:deathdate  ?deathDate}
    OPTIONAL {?act dbp-prop:birthplace ?birthPlace. FILTER(lang(?birthPlace)="en")}
    OPTIONAL {?act dbp-prop:deathplace ?deathPlace. FILTER(lang(?deathPlace)="en")}
}}
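
Since the demo can only show tabular data, the results of a query like the one above need flattening into rows. This is a sketch in Python, assuming the endpoint returns the standard SPARQL JSON results format. The function name `bindings_to_rows` is hypothetical. Variables left unbound by OPTIONAL are absent from a binding, so they are filled with None.

```python
import json


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON result document into (header, rows).

    A variable that an OPTIONAL left unbound is missing from the
    binding, so it becomes None to keep the table rectangular.
    """
    doc = json.loads(results_json)
    header = doc["head"]["vars"]
    rows = [
        [binding[var]["value"] if var in binding else None
         for var in header]
        for binding in doc["results"]["bindings"]
    ]
    return header, rows
```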

Things in RS

(on RS) thing, creator, name
(uses SPARQL 1.1 property path syntax)

PREFIX crm: <http://erlangen-crm.org/current/>
select ?obj ?act ?name {
  ?obj crm:P108i_was_produced_by / crm:P14_carried_out_by ?act.
  ?act crm:P131_is_identified_by / crm:P3_has_note ?name.
}

Paintings in RS

(on RS) painting, title, creator (of any part), name

PREFIX crm: <http://erlangen-crm.org/current/>
select ?obj ?title ?act ?name {
  ?part crm:P108i_was_produced_by / crm:P14_carried_out_by ?act.
  ?part crm:P46i_forms_part_of ?obj.
  ?obj crm:P102_has_title / crm:P3_has_note ?title.
  ?act crm:P131_is_identified_by / crm:P3_has_note ?name.
}

Rembrandt paintings in DBpedia

(on FF) Rembrandt paintings, with their EN title

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select * {
<http://dbpedia.org/resource/Rembrandt>
  <http://rdf.freebase.com/ns/visual_art/visual_artist/artworks> ?work.
  ?work rdfs:label ?title.
  FILTER(lang(?title)="en")}
  • Note1: the range of <http://dbpedia.org/property/works> is string not URI of the painting?!
  • PREFIX fb: <http://rdf.freebase.com/ns/> is not useful since the property uses slashes, so fb:visual_art/visual_artist/artworks is not valid syntax. (FF wrongly suggests the property is visual_art.visual_artist.artworks)
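
When queries are generated by a script, the prefix problem noted above can be sidestepped by abbreviating only when the local part is plainly safe. This is a minimal Python sketch. The names `SAFE_LOCAL` and `term` are ours, and the pattern is deliberately stricter than the full SPARQL grammar.

```python
import re

# Conservative test for a local name that can follow a prefix unescaped.
# This is stricter than the SPARQL grammar on purpose (assumption).
SAFE_LOCAL = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")


def term(iri, prefixes):
    """Return prefix:local when the local part is safe, else <iri>."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            local = iri[len(namespace):]
            if SAFE_LOCAL.match(local):
                return f"{prefix}:{local}"
    # Slashes (as in the Freebase property) fall through to the full IRI.
    return f"<{iri}>"
```

With `fb` bound to `http://rdf.freebase.com/ns/`, the artworks property comes back as a full IRI in angle brackets, while `http://dbpedia.org/property/birthdate` is shortened to `dbp-prop:birthdate`.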

Demo Queries

Things from RS, together with vital data from DBpedia

Painting authors from RS, add vital data (date/place of birth/death) from DBpedia

  • The only matches (from RDF Search and Explore) are:
  • so we have to add some sameAs assertions:
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX rkd-artist: <http://rkd.nl/thesaurus/artist/>
    INSERT DATA {
      dbpedia:Rembrandt owl:sameAs rkd-artist:Rembrandt.
      dbpedia:George_Hendrik_Breitner owl:sameAs rkd-artist:Breitner_George_Hendrik.
      dbpedia:Jacobus_Houbraken owl:sameAs rkd-artist:Houbraken_Jacob.
    }
    
  • then query:
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbp-prop: <http://dbpedia.org/property/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX crm: <http://erlangen-crm.org/current/>
    select * {
      ?obj crm:P108i_was_produced_by / crm:P14_carried_out_by ?act.
      ?act crm:P131_is_identified_by / crm:P3_has_note ?name.
      SERVICE <http://factforge.net/sparql> {
        ?act a foaf:Person.
        OPTIONAL {?act dbp-prop:birthdate  ?birthDate}
        OPTIONAL {?act dbp-prop:deathdate  ?deathDate}
        OPTIONAL {?act dbp-prop:birthplace ?birthPlace. FILTER(lang(?birthPlace)="en")}
        OPTIONAL {?act dbp-prop:deathplace ?deathPlace. FILTER(lang(?deathPlace)="en")}
    }}
    
  • we can also get all alternative names from DBpedia, but that's too much (eg Rembrandt has 6-7)
    OPTIONAL {?act rdfs:label ?DBPname. FILTER(lang(?DBPname)="en")}.
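
The sameAs assertions in the list above have to be extended by hand for every newly matched artist. This is a sketch of generating the same INSERT DATA update from a list of pairs. The function name `same_as_update` is hypothetical, and the pairs are the three from this page.

```python
HEADER = """PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX rkd-artist: <http://rkd.nl/thesaurus/artist/>
"""


def same_as_update(pairs):
    """Build an INSERT DATA update from (DBpedia name, RKD name) pairs."""
    triples = "\n".join(
        f"  dbpedia:{dbp} owl:sameAs rkd-artist:{rkd}."
        for dbp, rkd in pairs
    )
    return HEADER + "INSERT DATA {\n" + triples + "\n}"


update = same_as_update([
    ("Rembrandt", "Rembrandt"),
    ("George_Hendrik_Breitner", "Breitner_George_Hendrik"),
    ("Jacobus_Houbraken", "Houbraken_Jacob"),
])
```

The sketch assumes the local names are valid after a prefix as written. Names with unusual characters would need full IRIs instead.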

Things from the RS and Yale collections (UNION)

(on RS) thing, title

PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?thing ?title {
  {?thing crm:P102_has_title / crm:P3_has_note ?title}
   UNION
   {SERVICE <http://collection.britishart.yale.edu/openrdf-sesame/repositories/owlim>
     {?thing crm:P102_has_title ?t. ?t rdfs:label ?title .
      FILTER (?title != "") }}}

There are some bugs in the federated query parser, so keep the above exactly as it is:

  • property path syntax (crm:P102_has_title / rdfs:label) doesn't work
  • deleting the space after (?title != "") doesn't work
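
Given the parser quirks listed above, one way to avoid accidentally tidying the query is to generate it from a template that preserves the known-good layout. This is a Python sketch. The function name `union_with_service` is ours, PREFIX lines are left to the caller, and the remote pattern is inserted verbatim, so the caller must still avoid property paths in it.

```python
def union_with_service(variables, local_pattern, endpoint, remote_pattern):
    """Compose a local pattern UNION a remote SERVICE pattern.

    The layout mirrors the working query: a space is always emitted
    between the remote pattern and the closing braces.
    """
    return (
        "SELECT " + " ".join(variables) + " {\n"
        "  {" + local_pattern + "}\n"
        "   UNION\n"
        "   {SERVICE <" + endpoint + ">\n"
        "     {" + remote_pattern + " }}}"
    )


query = union_with_service(
    ["?thing", "?title"],
    "?thing crm:P102_has_title / crm:P3_has_note ?title",
    "http://collection.britishart.yale.edu/openrdf-sesame/repositories/owlim",
    '?thing crm:P102_has_title ?t. ?t rdfs:label ?title .\n'
    '      FILTER (?title != "")',
)
```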

Vlado's Presentation

(embedded presentation: ppt and pdf files)

Dominic's Site

http://annotate.oldman.me.uk/federate.html
Dominic Oldman - April 2012

Semantic Federated Search - Test developed by Yale Center for British Art, the British Museum and Ontotext

This web page retrieves results from two different decentralised databases with a single federated search query. There is no middleware or program code to connect the two data sources. One database contains data provided from the RKD's Rembrandt Project and is located in Sofia, Bulgaria. The other store belongs to the Yale Center for British Art at Yale University (YCBA), New Haven in the United States. This web site is located in Amsterdam. This test is part of work being undertaken by the British Museum and YCBA. Additional RDF stores will be added and functionality developed over time.

The main features of the test are:

  1. No middleware is used for the search. There is one query searching two separately located data stores.
  2. The data is semantically harmonised simply by conforming to the CIDOC-CRM ontology. Both sets of data agree about the meaning of certain concepts.
  3. There are no hardcoded links between the datasets. They were created separately, and no RDF triples have been inserted to assert any direct connections.
  4. The search illustrates that both sets of data agree about the meaning of a painting title, even though these concepts may have been implemented differently within the respective organisations.
  5. The web page has been tested with Firefox but may work with later versions of Internet Explorer.
  6. British Museum data will be included in due course.
  7. There are no database optimisations. The stores have only recently been installed.

Abstract: Data harmonisation without any middleware, without any hardcoded links, and based on a common semantic understanding of what constitutes a title for a painting. In this context it is not very complicated, but it shows the principle that can be applied. The tools in ResearchSpace are about, on the one hand, improving information and knowledge and, on the other hand, in doing so, making the connections between the data much stronger and based on scholarly annotation as well as catalogue and scientific information.
The academy and the museum are combined!

Function:

  1. There is a drop down with some keywords
  2. You select a word and the keyword is injected into an AJAX call to a federated search (Python CGI script) across the titles from the Rembrandt semantic database and the Yale semantic database. This is based on the CRM understanding of a title.
  3. The web page retrieves the data from http://www.rembrandtdatabase.org in Bulgaria and http://collection.britishart.yale.edu at Yale.
  4. The images are sourced on the basis of these object identifiers (although more artificially in the case of Bulgaria)
  5. You can clearly see the result of the keyword search.
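
Step 2 above injects a user-chosen keyword into a query. This is a minimal Python sketch of doing that safely. It is not the actual CGI script behind the page, both function names are hypothetical, and the use of CONTAINS and LCASE assumes an endpoint that supports SPARQL 1.1 functions.

```python
def sparql_string_literal(text):
    """Quote text as a SPARQL string literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def title_filter(keyword):
    """FILTER clause matching titles that contain keyword, ignoring case."""
    literal = sparql_string_literal(keyword.lower())
    return f"FILTER(CONTAINS(LCASE(STR(?title)), {literal}))"
```

Escaping the keyword keeps a stray quote in the input from closing the literal and altering the rest of the query.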

Reading material

There is currently some development going on in this area. Please find a short overview below.
First, there is Sesame. We are currently working on an integration of the SPARQL 1.1 federation extensions into core Sesame. A release of the new version is planned in around 2-3 weeks.
Then there is FedX, a federation SAIL for Sesame with sophisticated optimizations for federated query processing. Although FedX does not yet support SPARQL 1.1, it is capable of executing SPARQL queries against a federation of pre-configured SPARQL endpoints. Note that FedX automatically performs source selection (on the registered sources), and thus does not need the endpoints specified explicitly via SERVICE by the user. FedX provides a command line interface to allow for fast interaction, and an integration of SPARQL 1.1 is scheduled for the next few weeks.
Besides these projects there are some research projects dealing with federated query processing: SPLENDID (UKoblenz, paper at COLD 2011), SPARQL DQP (UPM) and DARQ (distributed ARQ, discontinued 2006).

  • Vlado: TODO: add references to projects presented at ESWC 2012 query federation workshop