Critical Architecture Topics
- DMS "hosts" UI components, both written by us (Java, maybe JSF) and 3rd party (open-source tools that may even be in different languages).
- So this hosting could use DMS UI extension mechanisms, but also pure HTML means (eg IFRAMEs, with one frame passing data to another, in which case XSRF refusal issues should be considered)
- This will be the primary topic of the Alfresco and Nuxeo pilots in RS3.1 (see RS Plan), and the ease of such hosting will be a primary factor in selecting the DMS
- DMS checks credentials and establishes security tokens (readonly/readwrite and which project) that must be passed to RDF and the Tools.
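As a rough illustration of the last bullet, the token could be a small immutable value that carries the project and the access level. This is only a sketch under assumptions: the names AccessToken, Access, canRead and canWrite are hypothetical, the user field is an added assumption, and how the token is signed or transported is left open.

```java
import java.util.Objects;

/** Hypothetical access level carried by the token (readonly/readwrite). */
enum Access { READONLY, READWRITE }

/** Hypothetical token issued by the DMS after it has checked credentials. */
record AccessToken(String user, String project, Access access) {
    // Compact constructor: reject incomplete tokens early.
    AccessToken {
        Objects.requireNonNull(user, "user");
        Objects.requireNonNull(project, "project");
        Objects.requireNonNull(access, "access");
    }

    /** Reading is allowed only inside the project the token was issued for. */
    boolean canRead(String targetProject) {
        return project.equals(targetProject);
    }

    /** Writing additionally requires the READWRITE level. */
    boolean canWrite(String targetProject) {
        return access == Access.READWRITE && project.equals(targetProject);
    }
}
```

RDF and each Tool would then check the token before serving a request, eg `token.canWrite("projectA")` before an update.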
Contrary to popular belief, our DMS-RDF integration requirements are not that big, since all metadata is stored in RDF (the reason being that DMS carries only flat metadata):
- Extract metadata from DMS, save in RDF
- Extract text from DMS docs, copy & index in RDF
After we select the DMS (based mostly on the ease of using DMS as a framework), we'll design the integration.
We should research the approach and results of IKS and Apache Stanbol: integrating with CMSs that have JCR, CMIS, or other APIs; mapping CMS content models to ontologies; APIs to various services (eg FactStore); passing data (JSON-LD)
How do the backend and frontend communicate? Eg if you need to show a painting and all its details, how will the data be represented? There are various options:
- domain object (eg Painting, Painter...) in Java. Pro: concrete, easy-to-grasp implementation. Cons: inflexible, more effort-intensive
- CRM conceptual object (eg Physical Man Made Object, "was created by") in Java. Pro: flexible. Cons: complex, hard to grasp; impedance mismatch with RDF (eg there might be multiple "was created by" properties, do we make an array in Java?)
- graph (subject/property/object) in Java. Pro: faithful to the RDF store, flexible. Cons: frontend devs must interpret all the data, hard to grasp
- RDF store, data is fetched with SPARQL as needed. Pro: easiest for the backend devs. Cons: most work for the frontend devs, involves lots of querying round-trips.
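To make the graph option and the multi-valued property question concrete, here is a minimal sketch, assuming hypothetical names (Triple, Graph, values) and plain strings for subjects, properties and objects. It always returns a list for a subject/property pair, which is one possible answer to "do we make an array in Java?".

```java
import java.util.List;

/** One RDF statement; all three parts are plain strings for brevity. */
record Triple(String subject, String property, String object) {}

/** Hypothetical in-memory graph handed from the backend to the frontend. */
class Graph {
    private final List<Triple> triples;

    Graph(List<Triple> triples) {
        this.triples = List.copyOf(triples); // defensive, immutable copy
    }

    /**
     * All objects for a subject/property pair, in insertion order.
     * Always a list (possibly empty), because an RDF property may repeat.
     */
    List<String> values(String subject, String property) {
        return triples.stream()
                .filter(t -> t.subject().equals(subject) && t.property().equals(property))
                .map(Triple::object)
                .toList();
    }
}
```

With two "ex:was_created_by" statements on the same painting, `values(...)` returns both creators; a domain object with a single `painter` field would have to drop one or be redesigned.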
Maybe Exhibit has the right answer: give me JSON, then tell me where to find the fields that interest me. That's a bit like the first option (domain objects), but the objects are not predeclared and are completely dynamic (whatever you put in a JS dictionary).
- the RDF/JSON format is described at http://docs.api.talis.com/platform-api/output-types/rdf-json
- note that OWLIM and Sesame do not support it
- Mitac found a cross-platform tool that converts from/to it, called Raptor: http://librdf.org/raptor/
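For orientation, RDF/JSON nests data as subject, then predicate, then an array of object descriptions with "value" and "type" keys ("uri", "literal" or "bnode"; "lang" and "datatype" are optional and omitted here). A minimal sketch of that grouping, assuming hypothetical names (RdfJson, Statement, group) and leaving the actual JSON serialization to whatever library we pick:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RdfJson {
    /** Hypothetical input statement; objectType is "uri", "literal" or "bnode". */
    record Statement(String subject, String predicate, String objectValue, String objectType) {}

    /** Groups flat statements into the subject -> predicate -> [objects] shape of RDF/JSON. */
    static Map<String, Map<String, List<Map<String, String>>>> group(List<Statement> statements) {
        Map<String, Map<String, List<Map<String, String>>>> root = new LinkedHashMap<>();
        for (Statement s : statements) {
            // One object description per statement.
            Map<String, String> object = new LinkedHashMap<>();
            object.put("value", s.objectValue());
            object.put("type", s.objectType());

            // Create the nested levels on demand, then append to the array.
            root.computeIfAbsent(s.subject(), k -> new LinkedHashMap<>())
                .computeIfAbsent(s.predicate(), k -> new ArrayList<>())
                .add(object);
        }
        return root;
    }
}
```

Repeated properties simply become longer arrays, so the frontend can look fields up by key without predeclared classes.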
This topic should answer these questions:
- How to integrate
- How to integrate a Tool into the overall system?
It should define the principles, methods and best practices. It should also document individual tool integrations as they happen.
It may or may not involve a separate software artifact (an actual ESB):
- I am wary of going deep into SOA and ESBs: my impression from CollectionSpace is that they went too far that way, which produced sub-optimal choices (eg not using Nuxeo in a core way) and results (IMHO sluggish system)
- we're not building a bank system that moves messages around, but (in one sense) a patchwork of GUI components
- we'll follow BM's principle that light-weight HTTP/XML/JSON/REST APIs should be preferred
- some deeper operations may require integration between Java components
- some Tools will likely involve integration at the GUI (HTML) level, such as a frame passing data to another