{excerpt}Open Annotation Collaboration (OAC) at{excerpt}

h1. Intro
OAC is a Mellon-funded project to develop the OAC ontology. Principal participants:
- Los Alamos National Laboratory
- The University of Queensland
- University of Maryland
- University of Illinois at Urbana-Champaign
- Maryland Institute for Technology in the Humanities

- Phase 1: Jul 2009-Dec 2010. [Final report|] (Apr 2011)
- Phase 2: [started Nov 2011|]

h2. Resources
- [Guiding Principles|] (latest revised version)
- [Beta Data Model Guide|], 10 August 2011
-- [Constraint Type List|]
- [oac.rdf|] ontology, saved to SVN
The one served by the namespace URI [] is old (alpha3)
- Examples:
-- [Hubble|]: images from the Hubble telescope
-- [Scholarly Editions|]
-- [Gatsby|]: commentaries on The Great Gatsby by F. Scott Fitzgerald
-- [CNN|]
- [Google Group|!forum/oac-discuss]
I've studied most of it
- [Reading list|] (recent papers)
I haven't read any of them yet (about 11)
- [Use cases|] and [User narratives|]

h3. Other resources
- list of [Annotation tools|], esp. [Image annotation|]
- another list of [Annotation tools|]

h2. Benefits
- Based on general Guiding Principles that hopefully will ensure longevity, sustainability and relevance to various domains
- Treats Annotation, Target (item being annotated) and Body (item comprising the annotation) as separate and distinct
- Provides for annotations of text, structured data, images, other multimedia
- Provides examples of application in widely varying domains, with great illustrations

h2. Disadvantages
- the model is still beta, not finalized
- recommends storing RDF data as an encoded string (cnt:chars) (sec [3.5|], [|], [3.9.1|]), which IMHO is a very bad idea
- does not yet define any way to publish, share, search, discover, subscribe to annotations:
-- to transfer: "Dereferencing the HTTP URI of the Annotation document results in an RDF serialization of an instance of this data model. Any of the RDF serialization formats are permissible, however RDF/XML is recommended"
-- "[companion publish/discovery document|]" (not yet available)
-- "In order to increase the likelihood of adoption, and in alignment with the goal of sharing annotations, no client-server protocol for publishing/updating/deleting annotations will be specified. Rather, the specifications will take a perspective whereby clients publish annotations to the Web and make them discoverable using common Web approaches"

h2. Implementations
Annotation implementations conforming to OAC:
- [Ajax XML Encoder|] was the first app to do OAC on a pilot basis
-- big name: “The MITHological AXE: Multimedia Metadata Encoding with the Ajax XML Encoder”
-- [live demo|]: includes text, image, audio, video. Not very impressive
-- [Zotero AXE demo video|]: integration of AXE to annotate image captured in Zotero
- [Shared Canvas|Image Annotation Tools#Shared Canvas]
- [SEMLIB|Image Annotation Tools#SEMLIB]
- [YUMA|Image Annotation Tools#RDF and OAC export]

h3. In development
[Partner projects|]
- [Annotation Supporting Collaborative Development of Scholarly Editions|]
- [Annotation and Transcription Tools for Digital Medieval Manuscripts|]
- [VideoAnnotator|]: in development, open source available
- [Annotation of Digital Emblematica|]
- (i) [Maphub Phase II|]: Web portal for annotating online historic maps. ("Georeferencing and annotating digitized, high-resolution historic maps"). All annotations will be represented in the OAC data model and become Web resources that can be referred to.
Created by Cornell, builds upon YUMA, built using Ruby on Rails. Open source.
- [Annotator for Fedora|]: developed at Brown University, not yet available

Watch [Projects|] and pages for future announcements

h1. OAC Overview
OAC summary/cheat sheet

h2. [Baseline Model|]
An oac:Annotation consists of oac:Body (sometimes called "content") and oac:Target (the resource being annotated).
||oac:hasBody|from Annotation to Body|
||oac:hasTarget|from Annotation to Target|
- There can be [Multiple Targets|], eg this Body relates to 3 Targets:
Fitzgerald makes key mistakes in accounting for Daisy's chronology:
-- We are told at one point in Chapter 4, that...
-- We also know from elsewhere in Chapter 4, that ...
-- And we are told at one point in Chapter 1, that ...
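
A minimal sketch of the baseline model as plain triples, assuming a simple tuple representation rather than any particular RDF library. The {{make_annotation}} helper and all {{ex:}} identifiers are hypothetical and only illustrate one Body linked to three Targets, as in the Gatsby case above.

```python
# Hypothetical helper: an OAC annotation as (subject, predicate, object) tuples.
def make_annotation(anno, body, targets):
    """Return triples linking one Annotation to one Body and one or more Targets."""
    triples = [
        (anno, "rdf:type", "oac:Annotation"),
        (anno, "oac:hasBody", body),
    ]
    # One oac:hasTarget triple per Target; the single Body is shared by all of them.
    triples += [(anno, "oac:hasTarget", t) for t in targets]
    return triples


# Illustrative identifiers only (not taken from the OAC examples).
gatsby = make_annotation(
    "ex:anno1",
    "ex:body1",
    ["ex:chapter4-passageA", "ex:chapter4-passageB", "ex:chapter1-passageC"],
)
targets = [o for (_, p, o) in gatsby if p == "oac:hasTarget"]
```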

h2. [Additional Properties|]
||dc:title|name of the Annotation|
||dcterms:creator|creator of the Annotation|
||dcterms:created|time and date at which the Annotation was created|
||oac:annotates|from Body to Target|

h2. [Annotation Predicates|]
Unofficial proposal for various annotation "dispositions" that are sub-properties of oac:annotates, eg:
- advises: advisesPreferredEdition advisesSupportingReference advisesFurtherResearch
- categorizes: categorizesBySubject categorizesByTemporal categorizesBySpatial
- changes: changesAddsErrata changesSuggestsEdit
- comments:
- compares: orders contrasts correlates
- explains
- questions: questionsFact questionsOpinion questionsConsistency
- replies: repliesAgreement repliesDisagreement repliesMixedOpinion
- seeAlso: seeAlsoOtherSources seeAlsoConfirming

h2. [Annotation Type|]
Defines specialized sub-classes of oac:Annotation. We also use one: rso:DataAnnotation.
The spec currently defines only one (the additional [Annotation Type List|] is empty):
||oac:Reply|an Annotation (A2) that is a reply to another Annotation (A1). A2.hasTarget=A1|
- Problem: A2 has no direct pointer to the ultimate target (eg image), nor to the root of the discussion thread.
(A W3C proposal for thread structure includes such a pointer: [].)
So if you need to find all annotations in a thread, or having the same ultimate target, you need to walk a chain.
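
The chain walk described above could look roughly like this. It is a sketch under simplifying assumptions: each annotation has exactly one target, and the {{has_target}} dictionary stands in for a lookup of oac:hasTarget in a store. The function names and {{ex:}} identifiers are hypothetical.

```python
def ultimate_target(annotation, has_target):
    """Follow oac:hasTarget from reply to reply until a non-annotation is reached.

    has_target maps each annotation to its single target. A target that is itself
    a key of the mapping is another annotation, so we are still inside the thread.
    """
    seen = set()
    current = annotation
    while current in has_target:
        if current in seen:
            raise ValueError("cycle in reply chain at %s" % current)
        seen.add(current)
        current = has_target[current]
    return current


def thread_members(root_target, has_target):
    """Return all annotations whose chain ends at root_target (one walk per annotation)."""
    return sorted(a for a in has_target if ultimate_target(a, has_target) == root_target)


# Hypothetical thread: A1 annotates an image, A2 replies to A1, A3 replies to A2.
has_target = {
    "ex:A1": "ex:image",
    "ex:A2": "ex:A1",
    "ex:A3": "ex:A2",
    "ex:B1": "ex:other",
}
```

Every lookup costs as many steps as the reply is deep, which is the cost a direct pointer to the root would avoid.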

h2. Serialization, Inline
"Dereferencing the HTTP URI of the Annotation document results in an RDF serialization of OAC".
- This follows Linked Data principles
- It's not entirely clear how much of the annotation graph should be returned. The included [example|] returns only [oac:Body|], using RDFa->RDF conversion

Inline [Body|], [Data Annotations|], [Constraints|]
Uses the W3C spec [Representing Content in RDF|] to put encoded data in a node (Body, Constraint):
|| cnt:ContentAsText | signifies that content is available as encoded data |
|| cnt:chars | representation of the resource, as plain text, in a certain encoding |
|| cnt:characterEncoding | character encoding of cnt:chars (or cnt:bytes?), such as "utf-8" or "ascii" |
|| dc:format | data format of the body, eg "N3" for RDF/N3 or Turtle; "SVG" for an SvgConstraint |
I have argued that [Representing RDF data as cnt:chars is a bad idea|!topic/oac-discuss/nOyWZKPKlew] since it defeats semantic repository indexes and query optimization
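
To illustrate the concern, here is a small sketch assuming a triple store modelled as a list of tuples. The {{inline_body}} helper, the {{ex:}} identifiers and the payload are hypothetical. The point is only that RDF embedded in cnt:chars is one opaque literal to the store, so a triple-pattern lookup cannot reach it.

```python
def inline_body(body, text, fmt, encoding="utf-8"):
    """Return triples for a Body whose content is embedded as a string via cnt:chars."""
    return [
        (body, "rdf:type", "oac:Body"),
        (body, "rdf:type", "cnt:ContentAsText"),
        (body, "cnt:chars", text),
        (body, "cnt:characterEncoding", encoding),
        (body, "dc:format", fmt),
    ]


# Hypothetical RDF payload, serialized as a string.
payload = "ex:painting1 ex:creator ex:rembrandt ."
store = inline_body("ex:body1", payload, "N3")

# A triple-pattern lookup by subject finds nothing: the statement is not in the graph.
direct_hits = [t for t in store if t[0] == "ex:painting1"]

# The only way to find it is a substring scan over every cnt:chars literal.
scan_hits = [s for (s, p, o) in store if p == "cnt:chars" and "ex:painting1" in o]
```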

h2. Fragments and Constraints
Fragments and Constraints allow one to select a resource *part* as annotation Body and/or Target, not just the entire resource. This important topic accounts for about half the spec
- Section in the main [Data Model|]
- Extended [Constraint Type List|]

h3. Fragments
Fragments are denoted as "URI#Fragment" and include:
- text: *char=* start,end; *line=* start,end
- X/HTML: *id* or **; *xpointer=*
- PDF: *page=* n; *viewrect=* left,top,width,height
- image: *xywh=* left,top,width,height (ie x,y,w,h)
- video: *t=* kind:start,end
-- where: kind: Normal Play Time (*npt*), SMPTE (*smpte*), or Wall Clock (*clock*)
-- start,end: in seconds (for *npt*)

||dcterms:isPartOf|links the fragment Target to the full resource. Typically:
<URI#fragment> dcterms:isPartOf <URI> |
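
A rough sketch of how a client might decode two of these fragment kinds and derive the dcterms:isPartOf link, using only the Python standard library. It assumes plain integer pixel values for *xywh=* and keeps time values as strings. The function names and example.org URIs are hypothetical.

```python
from urllib.parse import urldefrag

TIME_KINDS = ("npt", "smpte", "clock")


def parse_fragment(uri):
    """Split URI#fragment and decode xywh= and t= fragments into a dict."""
    base, frag = urldefrag(uri)
    name, _, value = frag.partition("=")
    if name == "xywh":
        # Assumes four comma-separated integers with no unit prefix.
        x, y, w, h = (int(v) for v in value.split(","))
        return base, name, {"x": x, "y": y, "w": w, "h": h}
    if name == "t":
        # Optional "kind:" prefix; npt is assumed when no known kind is given.
        kind, sep, times = value.partition(":")
        if not sep or not kind.startswith(TIME_KINDS):
            kind, times = "npt", value
        start, _, end = times.partition(",")
        return base, name, {"kind": kind, "start": start, "end": end}
    # Other fragment kinds are passed through undecoded.
    return base, name, {"raw": value}


def part_of_triple(uri):
    """Return the <URI#fragment> dcterms:isPartOf <URI> triple for a fragment URI."""
    base, _ = urldefrag(uri)
    return (uri, "dcterms:isPartOf", base)
```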

h3. Constraints
Constraints represent the resource part in a structured way. Some reuse [AO Selectors|] ( aos: ). They are represented as subclasses of oac:Constraint and include:
- text: oac:OffsetRangeConstraint: aos:offset (starting char), aos:range (number of chars)
- text: oac:PrefixPostfixConstraint: aos:prefix (before selection), aos:exact (the selection), aos:postfix (after selection)
- [image|]: oac:SvgConstraint: part is delimited by an SVG shape: path, rect, circle, ellipse, line, polyline, polygon, g(roup)
-- g(roup) is used only to create a part with a hole (eg donut)
-- point (marker) cannot be used?
-- the SVG content is obtained thus:
if the oac:SvgConstraint node is also cnt:ContentAsText, then from property cnt:chars
else the node's URI is dereferenced and the server should return the content
- time: oac:WebTimeConstraint: oac:when: to annotate the version of a resource as of the given time.
(When you cite a URL in a scientific paper, good style asks you to say "Accessed on ...", this is the same idea)
- context: oac:ContextConstraint: oac:inContextOf: to annotate a resource only in the context of a bigger resource.
Eg "this image is too bright in the context of that web page"
- [rdf|]: constraint-specific predicates that specify the resource part in a structured way
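
The two text constraints above can be sketched as follows. This is an illustration under assumptions, not the spec's algorithm: the helper names and the {{context}} width are hypothetical, and the constraint is shown as a plain dict keyed by the aos: properties.

```python
def offset_range(start, end):
    """OffsetRangeConstraint: aos:offset is the starting char, aos:range the char count."""
    return {"aos:offset": start, "aos:range": end - start}


def prefix_postfix(text, start, end, context=10):
    """PrefixPostfixConstraint: the selection plus up to `context` chars on either side."""
    return {
        "aos:prefix": text[max(0, start - context):start],
        "aos:exact": text[start:end],
        "aos:postfix": text[end:end + context],
    }


def locate(text, constraint):
    """Re-find a prefix/exact/postfix selection; return its start offset or -1."""
    needle = constraint["aos:prefix"] + constraint["aos:exact"] + constraint["aos:postfix"]
    hit = text.find(needle)
    return -1 if hit < 0 else hit + len(constraint["aos:prefix"])
```

An offset/range pair breaks as soon as text is inserted before the selection, whereas the prefix/exact/postfix form can still be re-located by searching for its context.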

h4. RDF Resource Constraint
I made a proposal for [OAC constraint extension|!topic/oac-discuss/TW2Z60Qpek0] to annotate an RDF property-instance:
RDF Resource Constraints are used to point to RDF statements. They match the framework of [Data Model Guide section|] and are attached to the Constraint (node C-1 on Figure 7.3.2).
They reuse the RDF Reification vocabulary.
||rdf:Statement|Signifies that the constraint is about an RDF statement. The statement conceptually "belongs" to the target T-1 (the object of oac:constrains).|
||rdf:subject|Subject of the statement being annotated, required. May coincide with the target T-1.|
||rdf:predicate|Property being annotated, required.|
||rdf:object|Object of the statement being annotated, optional. If missing, the annotation is about all statements with the given subject and predicate, and none of them in particular|
However, I have now come up with a cleaner representation (see below)

h1. RS Data Annotation in OAC
The previous version of RS [Annotation Design|Annotation Design] used a vocabulary based on CRM and extended with RSO.
We have now decided to use OAC, and the correspondence is shown below.
The key to understanding is this: [Annotation Points] map exactly to oac:Target (target of a link!).
There are two cases: annotating the entire Museum Object (MO) or annotating a Statement

h2. Case 1: Annotate Entire MO
The target is rso:E22_Museum_Object (the entire MO)
| *RSO+CRM* | *OAC+Reification* | *meaning; comments* |
| crm: E13_Attribute_Assignment | rso:DataAnnotation | annotation |
| rso:root | oac:hasTarget | link to target (being the MO) |
| | oac:Target,rso:E22_Museum_Object | annotation target (being the MO) |
| | oac:hasBody | link to body |
| | oac:Body | annotation body |

h2. Case 2: Annotate statement
The target is rdf:Statement with optional rdf:object. The target uses dcterms:isPartOf (as in [#Fragments]) to point to the entire MO
| *RSO+CRM* | *OAC+Reification* | *meaning; comments* |
| crm: E13_Attribute_Assignment | rso:DataAnnotation | annotation |
| | oac:hasTarget | link to target |
| | oac:Target,rdf:Statement | annotation target (being a statement) |
| rso:root | dcterms:isPartOf | Museum Object being annotated |
| crm: P140_assigned_attribute_to | rdf:subject | subject of statement being annotated |
| rso:property | rdf:predicate | property-instance being annotated |
| rso:object | rdf:object | object/value annotated/proposed |
| rso:other_object | same | old object/value criticised/justified |
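
A sketch of how a Case 2 target could be assembled and resolved, with triples again modelled as plain tuples. The helper names, the {{ex:}} identifiers and the sample data are hypothetical. The handling of a missing rdf:object assumes the reading that the target then covers all statements with the given subject and predicate.

```python
def statement_target(target, mo, subject, predicate, obj=None):
    """Return triples for a target that is an rdf:Statement and part of the MO."""
    triples = [
        (target, "rdf:type", "oac:Target"),
        (target, "rdf:type", "rdf:Statement"),
        (target, "dcterms:isPartOf", mo),
        (target, "rdf:subject", subject),
        (target, "rdf:predicate", predicate),
    ]
    if obj is not None:
        # rdf:object is optional; omit it to address the property as a whole.
        triples.append((target, "rdf:object", obj))
    return triples


def matching_statements(data, subject, predicate, obj=None):
    """Return the data statements the target refers to (all values when obj is None)."""
    return [
        t for t in data
        if t[0] == subject and t[1] == predicate and (obj is None or t[2] == obj)
    ]


# Hypothetical museum data: one object with two creator statements and a title.
data = [
    ("ex:mo1", "ex:creator", "ex:rembrandt"),
    ("ex:mo1", "ex:creator", "ex:workshop"),
    ("ex:mo1", "ex:title", "Self-portrait"),
]
```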

h2. Common properties
The rest of the properties apply to both cases:
| *RSO+CRM* | *OAC+Reification* | *meaning; comments* |
| | rso:DataAnnotation| Won't use [#Annotation Type] *oac:Reply*: see problem described there |
| rso:reply_to | same | point to original oac:Annotation; keep oac:hasTarget pointing to the MO |
| | oac:hasBody | link to body |
| | oac:Body | annotation body |
| rso:has_link | same | |
| rso:P3_has_title | same | title. Make subproperty of dcterms:title |
| rso:P3_has_description | same | description |
| rso:P2_reply_disposition | same | Won't use [#Annotation Predicates] because they are only proposed, not part of the spec |
| rso:P2_other_disposition | same | |
| rso:P2_annotation_status | same | |
| crm: P14_carried_out_by | dcterms:creator | creator |
| crm: P4_has_time-span. P82a_begin_of_the_begin | dcterms:created | date/time created |

h2. Graphical Comparison
A graphical example of Case 2 (annotation of a Statement) is shown below, comparing the old and new ways

h3. Annotation with RSO and CRM

h3. Annotation with OAC and Reification

h2. Diff Comparison

h2. TODO
- decide on a URI scheme. In particular, will the AP (rdf:Statement) have a nuxeo:uid or a hand-crafted URI based on its components?