How to ensure stability and compliance with linked data principles?
Which URIs should we use for semantic objects, digital assets, and RS tools (eg data annotation, search, image annotation) in order to ensure:
- stability: if something is rehosted, we don't want its URIs/URLs to change
- compliance with linked data principles (see Techniques and Links)
- compliance with REST principles
We still don't have a strategy for keeping RS URLs stable, yet resolvable, across different:
- environments (dev, test, prod),
- states (original data vs newly proposed vs published)
- branding requirements (www.researchspace.org or www.rembrandtdatabase.org, etc, but not researchspace.ontotext.com)
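One possible way to think about the environment problem is to store only one canonical URI per object and translate it to an environment-specific host when it has to be dereferenced. The following is a minimal sketch under that assumption, not a decided design. The function name `resolvable_url`, the `ENV_HOSTS` table, and the host `test.researchspace.org` are hypothetical; the other hostnames are the ones mentioned on this page.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "researchspace.org"

# Assumed hosts per environment; "test.researchspace.org" is invented for this sketch.
ENV_HOSTS = {
    "prod": "researchspace.org",
    "test": "test.researchspace.org",
    "dev": "researchspace.ontotext.com",
}


def resolvable_url(canonical_uri: str, env: str) -> str:
    """Map a stable RS URI to the URL that would serve it in the given environment."""
    parts = urlsplit(canonical_uri)
    host = parts.hostname or ""
    # Foreign URIs (eg museum or ULAN data) are returned unchanged.
    if host != CANONICAL_HOST and not host.endswith("." + CANONICAL_HOST):
        return canonical_uri
    # Keep any project subdomain, eg "rembrandt." (any port is dropped in this sketch).
    prefix = host[: len(host) - len(CANONICAL_HOST)]
    return urlunsplit(
        (parts.scheme, prefix + ENV_HOSTS[env], parts.path, parts.query, parts.fragment)
    )


print(resolvable_url("http://rembrandt.researchspace.org/object/1", "dev"))
```

With this approach the stored data never contains a dev or test hostname; only the component that resolves URIs needs to know which environment it runs in.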
This page should describe in detail how ResearchSpace allocates and uses URLs and URIs, with examples and justification.
Ontologist and sysadmin to collaborate on this design
|crm:||http://www.cidoc-crm.org/rdfs/cidoc_crm_v5.0.2_english_label.rdfs||CRM RDFS definition (uses no prefix)|
|crm:||http://crm.rkbexplorer.com/id/||BM data. TODO: check in BM SPARQL endpoint, I think they changed it|
- What is the RS home page (eg where you can see the list of projects)? home.researchspace.org or just researchspace.org?
- Don't forget www.researchspace.org is Dominic's site describing the project, and we don't want to get them mixed
- Where is the shared space/graph? Same as home page, or shared.researchspace.org?
- What are project URLs, rembrandt.researchspace.org or researchspace.org/rembrandt?
- How to reflect project data status? In accordance with REST, a good way is this:
rembrandt.researchspace.org/research: where researchers do their work
rembrandt.researchspace.org/published (or just rembrandt.researchspace.org): where the general public views
- What happens when the site is moved from Ontotext dev to prod? By default we'd develop at researchspace.ontotext.com, but how do we then move it to researchspace.org? Changing DNS records seems better than using the Ontotext domain
- How to support both a dev site and a prod site at BM? Ideally dev should be a simple copy of prod, but it must have different URLs, so how to do this trick?
- Do we use Apache access config for all these URLs? How does this mesh with RS access control that is initiated from the DMS?
Oh my, maybe I underestimated the Infrastructure effort.
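To make the two candidate project layouts from the questions above concrete (subdomain per project vs path per project, each with a /research and a /published view), here is a small sketch. The function `project_url` and its `layout` parameter are hypothetical names for illustration, and the choice between layouts is still open.

```python
BASE = "researchspace.org"
STATUSES = ("research", "published")


def project_url(project: str, status: str = "published", layout: str = "subdomain") -> str:
    """Build a project URL under one of the two candidate layouts."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    if layout == "subdomain":
        # eg rembrandt.researchspace.org/research
        return f"http://{project}.{BASE}/{status}"
    if layout == "path":
        # eg researchspace.org/rembrandt/research
        return f"http://{BASE}/{project}/{status}"
    raise ValueError(f"unknown layout: {layout}")


for layout in ("subdomain", "path"):
    for status in STATUSES:
        print(project_url("rembrandt", status, layout))
```

The subdomain layout needs a DNS entry (or wildcard record) per project, while the path layout keeps everything under one hostname, which may matter for the Apache and access-control questions.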
- when some semantic object (eg a Painting from a museum or a Painter from ULAN) is imported, do we change the URI?
Guess not. But that means our object networks will use a mish-mash of URI prefixes.
- as new knowledge is created in RS (eg annotation or new/changed value), how do we mint URIs?
Do we tack a suffix onto the existing foreign URIs, or create brand new ones in an rs: namespace?
Maybe a mix is appropriate (suffix for small changes, new for "bigger" annotations etc)
- are RS URIs per project (each project has its own prefix), or share the same prefix?
- when a project is published, do the URIs change? (For that matter, when the status of some object or field changes, does its URI change?)
If not, doesn't that conflict with LD principles, since the URL changes from /research to /published?
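The two minting options discussed above (a suffix on the foreign URI vs a brand new URI in an rs: namespace) could look roughly like this. This is only a sketch: the namespace `http://researchspace.org/id/`, the function names, and the use of random UUIDs as local names are all assumptions, not decisions.

```python
import uuid

# Hypothetical rs: namespace; the real prefix is one of the open questions.
RS_NS = "http://researchspace.org/id/"


def mint_suffixed(foreign_uri: str, suffix: str) -> str:
    """Option 1: extend an imported URI, eg for a small change to one field."""
    return foreign_uri.rstrip("/") + "/" + suffix.lstrip("/")


def mint_new(kind: str) -> str:
    """Option 2: create a fresh URI in the rs: namespace, eg for a larger annotation."""
    return f"{RS_NS}{kind}/{uuid.uuid4()}"


print(mint_suffixed("http://example.org/painting/1", "annotation/1"))
print(mint_new("annotation"))
```

Note that a suffixed URI lives in the foreign publisher's namespace, so RS could not make it resolvable by itself; a URI in the rs: namespace does not have that problem.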
- DNS record changes to redirect the same hostname (eg home.researchspace.org) to IPs managed by different organizations (eg Ontotext IP while in dev, BM IP while in prod)
- Apache redirections
- Cool URIs for the Semantic Web:
- hash vs slash URIs (hash URIs are fine for small ontologies, but KBs with a lot of instances should use slash URIs)
- content negotiation and 303 redirects (serve RDF to machines, HTML to people)
- Best Practice Recipes for Publishing RDF Vocabularies
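As an illustration of the content negotiation technique listed above, the sketch below picks a redirect target from the HTTP Accept header, using the 303 See Other redirect that Cool URIs for the Semantic Web describes for slash URIs. The `/data/` and `/page/` path prefixes and the function names are hypothetical, and a real deployment would more likely do this in Apache configuration than in application code.

```python
RDF_TYPES = ("application/rdf+xml", "text/turtle")
HTML_TYPES = ("text/html", "application/xhtml+xml")


def parse_accept(header: str) -> list:
    """Return the media types of an Accept header, most preferred first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        fields = [f.strip() for f in item.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if q > 0:
            prefs.append((-q, position, fields[0].lower()))
    return [media_type for _, _, media_type in sorted(prefs)]


def see_other(local_name: str, accept: str) -> tuple:
    """Choose the 303 redirect target for a request to /id/<local_name>."""
    for media_type in parse_accept(accept):
        if media_type in RDF_TYPES:
            return 303, "/data/" + local_name  # RDF for machines
        if media_type in HTML_TYPES:
            return 303, "/page/" + local_name  # HTML for people
    return 303, "/page/" + local_name  # default to the human-readable page


print(see_other("painting/1", "text/turtle"))
print(see_other("painting/1", "text/html,application/xml;q=0.9,*/*;q=0.8"))
```

The URI of the object itself (`/id/...` here) never returns a document directly; it only redirects, which keeps the identifier of the thing distinct from the URLs of the documents that describe it.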