Also see Autocomplete Ranking, Autocomplete Testing

Intro

Compared to the original FORTH definition, FR Implementation has made many fixes, which often result in simpler network diagrams.
But after initial implementation of FRs, we saw various problems and omissions with the FRs themselves.
Because of the complex nature of this topic, all further discussion and decisions should be recorded on this page.
The rest of the page is divided into sections that reflect the status of enhancements. As planning decisions are made, enhancements will be moved between sections.

I have endeavored to collect all related Jira issues as subtasks under RS-1321
and they can be found using parent=RS-1321

BM COL Search

New BM COL search tool: http://www.britishmuseum.org/system_pages/beta_collection_introduction/beta_collection_search_results.aspx
Compare RS search features to COL search. Dominic: I think I prefer the features in the RS search!

Notes on specific aspects:

  • Objects with images -> (to be done)
  • Ethnic Group -> by Actor (done)
  • School Of -> influenced by Actor (to be done)
  • Escapement Type -> is/has/about type (done)

Implemented Enhancements

Mixing Properties

P46B.forms_part_of*|P106B.forms_part_of*|P148B.is_component_of*

is not sufficient to catch all cases because it doesn't allow mixed properties, so it should be reformulated like this:

(P46B.forms_part_of|P106B.forms_part_of|P148B.is_component_of)*

allowing mixed iterations of these properties.

There are many cases like this.
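
The difference between the two formulations can be illustrated with ordinary regular expressions. This is only an analogy, not the FR implementation: the one-letter codes and the helper names encode and matches are assumptions made up for this sketch.

```python
import re

# Hypothetical one-letter codes for the three part-of properties (illustration only).
CODES = {
    "P46B.forms_part_of": "a",
    "P106B.forms_part_of": "b",
    "P148B.is_component_of": "c",
}

def encode(path):
    """Turn a list of property names into a string of one-letter codes."""
    return "".join(CODES[p] for p in path)

UNMIXED = re.compile(r"a*|b*|c*")   # original formulation: one property repeated
MIXED = re.compile(r"(?:a|b|c)*")   # reformulation: any mix of the three properties

def matches(pattern, path):
    """True if the whole path is accepted by the pattern."""
    return pattern.fullmatch(encode(path)) is not None

same = ["P46B.forms_part_of", "P46B.forms_part_of"]
mixed = ["P46B.forms_part_of", "P148B.is_component_of"]

print(matches(UNMIXED, same), matches(UNMIXED, mixed))  # second path is rejected
print(matches(MIXED, same), matches(MIXED, mixed))      # both paths are accepted
```

A part that "forms part of" a component that "is component of" the whole is a chain of two different properties, which only the second formulation accepts.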

Creation vs Activity

ObjectCreation – (P9B.forms_part_of) (0,n) -> ObjectCreation
The latter should be E7.Activity otherwise you won't catch this:

E63.<Conception of theory of gravitation> P9B.forms_part_of E7.<Newton idling under apple tree> .
E7.<Newton idling under apple tree> P14F.carried_out_by <Newton>.

Event vs Period

BM has use cases "Thing was produced in Period/Culture and in Political State", which are mapped to:

<object/YCA75313/production> 
  P10_falls_within thes:x14625; # Late Christian Period
  P10_falls_within thes:x12345. # Roman Empire

P10 can be used, since Production is a period (Production<Modification<Activity<Event<Period),
and the MatCult/PoliticalState terms are also Periods.

I think this should fall in the FR "Thing has met/is from/was present at Event" (uses P12i_was_present_at and P9i_forms_part_of, and has range E5_Event) which should be generalized to "Thing is from Event/Period".
For this, we have to examine/change several classes/properties:

  • range: I think all these are legitimate: "Thing refers to or is about Late Christian Period", "Thing from Late Christian Period", "Thing destroyed in Late Christian Period", "Thing created in Late Christian Period", "Thing modified in Late Christian Period", "Thing used in Late Christian Period", so half of section "Thing-Event" of FRthing.docx should be changed to speak of E4.Period instead of E5.Event
  • the specific properties (eg created) are all sub-properties of P12, which requires an E5.Event not mere E4.Period.
    But by following the hierarchy (P9i), we can transition from E5 to E4
    Here's the property path: P12-(E5)P9i(E4). Note: P108 implies P12 since P108<P31<P12
  • P9i_forms_part_of or P10_falls_within?
    Various FRs chase the event part hierarchy (P9i) but not the "falls within" hierarchy (P10).
    For the most part this is appropriate since P10 "does not imply any logical connection between the two periods"
    But in this BM case P10 sounds more appropriate to me than P9i: Production "falls within" Late Christianity, but Late Christianity hardly "consists" of production events of Christian objects
  • So overall, I think that the definition "Thing from Event"
    FC70_Thing -- P46B.forms_part_of*|P106B.forms_part_of*|P148B.is_component_of* -> FC70_Thing:
     { FC70_Thing -- P12B.was_present_at-> E5.Event:
        {E5.Event--P9B.forms_part_of* -> 
            E5.Event [--P2F.has_type -> E55.Type] }}
    

    should be changed to "Thing from Event/Period"

    FC70_Thing -- (P46B.forms_part_of|P106B.forms_part_of|P148B.is_component_of)* -> FC70_Thing:
     { FC70_Thing -- P12B.was_present_at-> E5.Event:
        {E5.Event--(P9B.forms_part_of|P10.falls_within)* -> 
            E4.Period [--P2F.has_type -> E55.Type] }}
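
A minimal Python sketch of why the period hierarchy has to be followed over both properties. It is not the FR engine: the EDGES table is a toy graph, and the P12i edge from the object to its production is an assumption standing in for the inference P108<P31<P12. The function names closure and from_event_or_period are hypothetical.

```python
# Toy graph: (node, property) -> list of target nodes.
EDGES = {
    # Assumed edge: the object was present at its own production (P108 implies P12).
    ("object/YCA75313", "P12i_was_present_at"): ["object/YCA75313/production"],
    # From the BM data quoted above.
    ("object/YCA75313/production", "P10_falls_within"): ["thes:x14625", "thes:x12345"],
}

def closure(start, props):
    """Nodes reachable from start in zero or more steps over any of props."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for prop in props:
            for nxt in EDGES.get((node, prop), []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return seen

def from_event_or_period(thing, props):
    """Events/periods the thing is 'from': P12i first, then the given hierarchy."""
    found = set()
    for event in EDGES.get((thing, "P12i_was_present_at"), []):
        found |= closure(event, props)
    return found

only_p9 = from_event_or_period("object/YCA75313", ["P9i_forms_part_of"])
both = from_event_or_period("object/YCA75313", ["P9i_forms_part_of", "P10_falls_within"])
print(sorted(only_p9))
print(sorted(both))
```

With P9i alone the search stops at the production event; adding P10 reaches the Late Christian Period and Roman Empire terms.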
    

Combinatorial Explosion

RS-705
FR Implementation#BUG

Part Made Of

RS-1066
Should be fixed now

Subject-Iconography

RS-705
Subjects (eg RKD Keywords) and Iconography (eg RKD IconClass) often cannot be interpreted as Person, Place, Event because they represent ideas. Thus they are often interpreted as "types" (concepts).

  • So I included P67_refers_to to FR2_has_type.
  • I also included P128_carries: Physical Thing carries a Conceptual Thing that P67_refers_to a concept.

FR Names and Scope Notes

RS-1006
RS-1032

FR names and scope notes are now controlled by meta-data at FR Names Table (rdfs:label and rdfs:comment respectively).
After the table is edited in Confluence, the metadata is regenerated with Emacs and saved to thesaurus-meta.ttl.

Search Result Cutoff

We cut off the search results at 500, because of Exhibit limitations, and since a user isn't likely to explore more objects in the list.
But there's a bug where we may cut off earlier:
RS-1453

Autocomplete Performance

RS-939
RS-1037
RS-1198
See page Autocomplete Performance

FTS Leaks

RS-1288
RS-1330
See Investigating FTS Molecules for a description of how FTS indexing of one object can leak into another. The last FTS leaks (described at Business Properties) were due to:

  • bibliography references (bibo:Document) not marked as skos:Concept: Josh bug
    Resolved by marking them as skos:Concept, and cutting off at bibo:Document
  • images shared between objects: this is expected
    Resolved by cutting off at FC70_Object

Unexpected FR Consequences

RS-1238

Cannot Reproduce, Invalid, Undecided Issues

RS-1287
RS-1048 : now appears
RS-1360 : returns objects about "clothing"
RS-1488
RS-1489 : What would you propose to do?

Planned Improvements

Owner vs Keeper

RS-1013
RS-1264
RS-1308
RS-1341
RS-1348
RS-1407

  • In the original definition, "Thing from Actor" includes P52_has_current_owner only. We added P50_has_current_keeper.
    P50_has_current_keeper doesn't appear in any FR at present, so "current keeper is Ancient Egypt and Sudan department" cannot be formulated.
  • "Owner" is not originally defined as a FR. Questions:
    • Should it be only P52_has_current_owner (title holder)?
      If so we don't need a new FR, we just need to include P52 in the [FR Names Table] with hasRange=E39_Actor.
    • Or maybe we should also include P50_has_current_keeper (custody holder)?
    • How about the historic properties: P51_has_former_or_current_owner and P49_has_former_or_current_keeper respectively?
    • How about historic events: E8_Acquisition and P22_transferred_title_to?
      One of the "met" clauses comes in this way.
      In good data a historic event should also be expressed as P51_has_former_or_current_owner, but this is not yet done in BM Association Mapping, because BM codes don't always make it clear whether the other party was owner or not
    • If we include events, how about fancy loops like in "met" in RS-1000@jira?

DECISION:

  • FR "owner/keeper": P52, P50
    DONE: FR52_current_owner_keeper
  • FR "all owners/keepers": P51, P49, and the events E8_Acquisition, E10_Transfer_of_Custody
    DONE: FR51_former_or_current_owner_keeper

By, Met Actor

RS-1000 : Large explanation of by/met
RS-1031

What's the difference between 'met' and 'by'?
I am not sure that the 'from' FR is quite right in terms of balancing recall and precision. In particular, it returns objects from people born at a place together with objects created at that place. Rembrandt was born in Leiden, so if you ask for objects from Leiden you get all of Rembrandt's works, which seems wrong. We need to tease this FR out to have one specifically for objects created by someone from x. It may be that some FRs are not as aggregated as others (if at all) but should be oriented towards what users would expect.

DONE

  • "by": produced, produced part, made inscription, modified

DONE: "met" as a very broad relation to Agent: met Actor in the same event, or was involved in acquisition or custody, or was owner/keeper. It includes:

  • "by"
  • "found"
  • "all owners/keepers"
  • TODO: "influenced/motivated" is not yet included

Nationality and Ethnic Group

RS-1236
"By Actor" should find things made by a given Nationality or Ethnic Group.

Nationality
We mapped it to

<author> P107i_is_current_or_former_member_of <nationality>

Eg here's a query to find "Italian" objects:

Examples:

object    | created          | transferred title from
CEM313033 |                  | Marchese Giovanni Pietro Campana
CGR306654 | Cavino, Giovanni |
CME301074 | Danese Cattaneo  |

Ethnic Group
We remapped it to rso:hasDomain E39_Actor. Here's a query to find such productions.

production                    | ethname       | ethnic group            | other examples (tested)
object/EOC122081/production/7 | idThes:x84612 | "Papua New Guinean"     | EOC111302, EOC117041, EOC118123
object/ENA122241/production/8 | idThes:x92855 | "Northeast Peoples"     |
object/EAF122449/production/2 | idThes:x83221 | "Bantu"                 |
object/ENA122544/production/2 | idThes:x89906 | "Eskimo-Aleut"          |
object/EAF121643/production/2 | idThes:x92855 | "Nilotes"               |
object/EOC122058/production/5 | idThes:x83151 | "Indigenous Australian" |
object/RRM192155/production/4 | idThes:x83258 | "Bedouin"               |
object/EAF122191/production/7 | idThes:x92478 | "Fulbe"                 |
object/RRM192138/production/4 | idThes:x83121 | "Armenian"              |

Influenced-Motivated

RS-989
RS-1286
Use Cases

  • How to find coins related to Augustus? Coins whose production is Motivated By Emperor Commodus?
    • Term
      <person-institution/57074> rdfs:label "Augustus (Octavian)".
      
    • Production Motivated by
      <object/CGR307137/production/4> crm:P17_was_motivated_by <person-institution/57074>.
      <object/CGR307137/production/4/association> bmo:PX_property crm:P17_was_motivated_by;
        crm:P141_assigned <person-institution/57074>;
        crm:P2_has_type <thesauri/authority/K>. # "Authority Association K :: Augustus ::"
      

      (Similarly, Emperor Commodus motivated the production of CGR307142)

  • How can I search for "in the Manner/style of :: Constable, John"? (That's an English painter of special interest to Yale)
    • Association code "AL: Manner/Style of" is mapped to Influenced By

Notes:

DECISION:

About (Depicted, Refers to)

RS-1061

  • Example: Depicted (Portrait of)
    <object/CGR307137> 
      crm:P62_depicts <person-institution/57074>.
    <object/CGR307137/image/1> crm:P138_represents <person-institution/57074>;
      crm:P2_has_type <thesauri/association/IP>.
    
  • Aboutness (P62_depicts, P67_refers_to, P129_is_about, P138_represents).
    • We got "Thing refers to or is about Place", but "Thing refers to Actor" is not yet done
    • Define FR67_refers_to_actor based on p44 of TR429
    • Cross-check against FR67_refers_to_or_is_about
    • Roughly maps to BM Association Mapping#Associated Person

DECISION:

  • new FR "about actor"

From Place/Findspot

RS-1049
RS-1282
RS-1057
RS-1283 : "found by" should work for joint finders

Thing with findspot "Qasr Ibrim Nubia" cannot be found. The data is:

<object/YCA79827> crm:P12i_was_present_at <object/YCA79827/find>.
<object/YCA79827/find> crm:P2_has_type <thesauri/find/E>;
  crm:P7_took_place_at <place/x22898>.

Problem: the Find event for an object is not a standard CRM class. BMO defines an extension class bmo:EX_Discovery. The original definition used something made up (C2.Finding) which is not defined anywhere:

  • Thing from Place- FR7_from_place: "without C2.Finding which is undefined"
  • Thing found or acquired at Place- WONTDO: "C2.Finding is not defined"

DECISION:

Thing present at Event or from Period

RS-1235
Material Culture is mapped to crm:E4_Period and should be available through FR "present at" (FR12_was_present_at) having scope note
"Thing was present at (has met, is from) event/period"

<object/YCA75313/production/3>
  crm:P3_has_note "Production Period / Culture :: Late Christian Period";
  crm:P10_falls_within thes:x14625 .

About Event

RS-1445
The "BM Event" thesaurus includes named historic events, eg 'Capture of Olinda de Phernambuco'.
It should be searchable with FR67_about_event and FR12_was_present_at, but currently is not.
The reason is that it hasDomain=E7_Activity, but should be E4_Period (a superclass thereof)

DONE: two FRs:

  • about event (eg historic event)
  • present at Event (eg Exhibition) or from Period (eg Etruscan)

Preferred FRs

RS-988
RS-996

We need the most common relationships to be selected by default in the FR search.
This is currently done for Agent only: 'by' is preferred to 'met'.
But we want a more general solution driven by extra metadata in the FR Names Table:

  • The integer hasOrder determines the order of FRs in the dropdown
  • hasOrder=1 is selected by default (except for the 4 special FRs, see next section)
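
A minimal sketch of how hasOrder could drive the dropdown, assuming each FR is loaded from the FR Names Table as a name plus an integer order. The FR dataclass, the function names and the order values given to 'by' and 'met' are hypothetical, and the special text-entry FRs are not modelled.

```python
from dataclasses import dataclass

@dataclass
class FR:
    name: str
    has_order: int  # stands in for the hasOrder metadata value

def dropdown(frs):
    """FRs in the order they should appear in the dropdown."""
    return sorted(frs, key=lambda fr: fr.has_order)

def default_fr(frs):
    """The FR with hasOrder=1, or None if no FR is marked as preferred."""
    for fr in dropdown(frs):
        if fr.has_order == 1:
            return fr
    return None

agent_frs = [FR("met", 2), FR("by", 1)]  # order values are assumed for illustration
print([fr.name for fr in dropdown(agent_frs)])
print(default_fr(agent_frs).name)
```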

Keyword-Date-Identifier Search

RS-1284
RS-1408

There are 3 FRs that allow text entry (rso:hasRange=rdfs:Literal): date, identifier and keyword

  • When the user enters a text query in:
    • The top-right search box
    • Or in the search sentence (and does not autocomplete)
  • Trim spaces from the start & end of the query
  • Do smart regexp matching of the text query:
  • Allow switching between these 3 cases
    • Use a dropdown including "keyword", "identified by" and "date interval"
    • Select the matched alternative by default
    • Allow the user to override by selecting "keyword" or "identified by" (but don't allow "date interval" unless it matches the Date Regexp)
  • "keyword" is at the bottom of all FR dropdowns. If the user selects "keyword", the dropdown changes to the same 3 alternatives

Should we remove the Objects/Images/Forums dropdown from top-right? It doesn't work

Identifier Regexp

RS-1477
Can we try to recognize an identifier, and auto-select "identified by" in the dropdown (still allowing the user to override)?
Based on analysis of Sample Identifiers:

  • we can't catch ids including spaces (all grcatno and some otherid examples) without the risk of misinterpreting multiword keyword queries as id queries
  • we want the query to include at least one digit/punctuation, and then maybe some letters
  • if the query has only digits, should be >=5 digits (4 digits is a year).

This query regexp implements the first two considerations (the minimum length for digit-only queries needs a separate check):

^[a-zA-Z0-9,._-]*?[0-9,._][a-zA-Z0-9,._-]*$

Note: *? is a non-greedy (lazy) quantifier that causes the regexp to match in only one way (reduces backtracking)
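
A minimal Python sketch of the identifier check, using the regexp above unchanged. The function name looks_like_identifier is hypothetical, and the digit-only length test is added here as a separate step because the regexp alone also accepts a 4-digit year.

```python
import re

# The identifier regexp from this section, unchanged.
ID_RE = re.compile(r"^[a-zA-Z0-9,._-]*?[0-9,._][a-zA-Z0-9,._-]*$")

def looks_like_identifier(query):
    """Guess whether a text query should default to "identified by"."""
    q = query.strip()              # trim spaces from the start and end
    if not ID_RE.match(q):
        return False               # contains a space, or has no digit/punctuation
    if q.isdigit() and len(q) < 5:
        return False               # short all-digit queries are more likely a year
    return True

for sample in ["EA10898", "1977,1105.53", "69989", "2008", "Bronze 716", "Easter Island"]:
    print(sample, looks_like_identifier(sample))
```

The samples are taken from the Sample Identifiers table below; "Bronze 716" is rejected on purpose, since ids with spaces cannot be told apart from multiword keyword queries.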

Sample Identifiers

(prefix http://collection.britishmuseum.org/id/thesauri/identifier/):

Type    | Identifier Type            | Examples
assetid | BM Asset Id                | 69989; 687142
bigno   | BM Big Number              | EA99; EA10898; EA10558,14
cmcatno | BM Coins & Medals Number   | BA1p222.1023; BC.1067; EO8p330.1134; MI1p21.76N; MI1p36.154B
codexid | BM Codex Id                | 3494578
grcatno | BM Greece & Roman Number   | Bronze 716; CIL VI 8467; Vase A793.3; Sculpture C159; CIL VI 166 = 30706; CIL XV 6350, 66; CIL VI 2602*; Waywell 1978 368; Old Catalogue No. 123
otherid | BM Other number            | BS.10058; BS.4799.c; BS.6639a; Hay 468; I.Brit.Mus.Gr. I App:7text edition...; Kropp M; P. Ramesseum 13; Sheet.1; S.1272Sheet.10; Sallier 4Sheet.9; Sheet.1P. Ramesseum 3
prn     | BM Public Reference Number | PPA44216; RRM192762; YCA66158
regno   | BM Registration number     | S.6223; S.2534; .49; .18192.a; .10466.19; 1977,1105.53; 1920,1115.1.2; 1979,0108.59.A; 1878,1217.123-140; Oc1946,1027.5

Identifier Data References

Sources used for the samples above:

  • GLAM Wiki: BM Refs
  • BM: Help using the Museum number and provenance search
  • manual_assertions.ttl, search for thesIdentifier
  • get identifier types:
    prefix skos: <http://www.w3.org/2004/02/skos/core#>
    select distinct ?type {
      ?type skos:inScheme <http://collection.britishmuseum.org/id/thesauri/identifier>
    } order by ?type
    
  • get all identifiers, so I can extract regexes out of them
    prefix crm: <http://erlangen-crm.org/current/>
    prefix skos: <http://www.w3.org/2004/02/skos/core#>
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    
    select ?type (group_concat(?l) as ?labels) {
      ?id crm:P2_has_type ?type; rdfs:label ?l.
      ?type skos:inScheme <http://collection.britishmuseum.org/id/thesauri/identifier>
    } group by ?type order by ?type
    

Keyword Search of Identifier

  • Keyword search doesn't really work for searching identifiers because of the way the FTS index is set up: it breaks words on punctuation, not only on space.
    • PRN (eg CGR307294) is found fine using keyword search
    • regno (eg 2008,1008.931) was mistaken for a date search (should be fixed now)
    • otherid (eg 74.3.7/1) cannot be found by keyword search

That's why we have a specific FR "identified by", which uses exact match, not FTS query
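
A rough sketch of the tokenization effect, assuming the index splits on every character that is not a letter or digit. The real FTS analyzer is not described on this page, so fts_tokens is only a hypothetical stand-in.

```python
import re

def fts_tokens(text):
    """Assumed tokenizer: split on anything that is not an ASCII letter or digit."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", text) if token]

print(fts_tokens("CGR307294"))      # PRN stays a single token
print(fts_tokens("2008,1008.931"))  # regno falls apart into number fragments
print(fts_tokens("74.3.7/1"))       # otherid falls apart into very short fragments
```

Under this assumption a PRN survives as one searchable word, while a regno or otherid is reduced to short numeric fragments that no longer identify the object.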

Objects with Images

Ability to restrict search to objects that have an image.

  • UI: checkbox in front of the search sentence:
    Find Objects  [ ] with images  [current search sentence]
    
    • Dominic: approved (Jana thinks it doesn't fit quite well into the search sentence)
  • Search name: if the flag is set, append " (with images)" at the end

Backend design:
RS-1474

  • Add handling of special FR rso:FR138i_has_representation in SearchObjects method
    (FRSearchRestriction with frUri=rso:FR138i_has_representation and value='true')
  • If requested, add subquery exists(?Images) where ?Images is defined in Search Result Fields
    (note it is different for BM vs RKD, so a UNION query is needed)

Frontend design:
RS-1454

  • add checkbox "with images" (this is the name of rso:FR138i_has_representation)
  • if checked, add rso:FR138i_has_representation to the SearchObjects API call
  • add rso:FR138i_has_representation in Search Serialization (for history) and restore from JSON
    Try to reuse current FRSearchRestriction, by adding frUri=rso:FR138i_has_representation and value true (false=no such FR)
    • if present, add "(with images)" at the end of the search name
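
A small sketch of the frontend flow, assuming a restriction is serialized as a JSON object with the frUri and value fields named above. The function names and the exact JSON shape are assumptions, not the actual Search Serialization format.

```python
import json

HAS_IMAGE_FR = "rso:FR138i_has_representation"

def image_restrictions(with_images):
    """Restrictions to add to the SearchObjects call for the checkbox state."""
    return [{"frUri": HAS_IMAGE_FR, "value": "true"}] if with_images else []

def search_name(base_name, restrictions):
    """Append the suffix to the search name when the image restriction is present."""
    flagged = any(
        r["frUri"] == HAS_IMAGE_FR and r["value"] == "true" for r in restrictions
    )
    return base_name + " (with images)" if flagged else base_name

# Round trip through JSON, as when a search is saved to history and restored.
saved = json.dumps(image_restrictions(True))
restored = json.loads(saved)
print(search_name("by Rembrandt", restored))
print(search_name("by Rembrandt", image_restrictions(False)))
```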

Better Autocomplete Ranking

RS-298
RS-1043
See pages Autocomplete Ranking, Autocomplete Testing

Hierarchical Place Facet

RS-1487
The hierarchical place facet in Exhibit sometimes shows parent-child places (eg Easter Island-Rano Raraku) as siblings

Don't loop over place hierarchy for About

RS-1490

Currently FR67_refers_to_or_is_about loops down the place hierarchy (over P89i_contains). This makes for some unintuitive results:

  • EPF112795 is a picture of a lake that depicts Europe -> inferred to be about Europe>Bulgaria>Banya Bulgaria
  • PPA361578 represents Oceania -> inferred to be about Oceania>Polynesia>Easter Island>Rano Raraku

See the discussion in Jira. I think About should not loop over the place hierarchy, neither up (co-variant) nor down (contra-variant)
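
The effect can be sketched in Python with the Oceania example above. The CONTAINS and ABOUT tables are toy data and the function names are hypothetical; the sketch only shows how looping down P89i_contains turns one direct statement into several inferred ones.

```python
# P89i_contains, parent place -> child places (toy data from the example above).
CONTAINS = {
    "Oceania": ["Polynesia"],
    "Polynesia": ["Easter Island"],
    "Easter Island": ["Rano Raraku"],
}

# Direct "about" statements: PPA361578 represents Oceania.
ABOUT = {"PPA361578": {"Oceania"}}

def descendants(place):
    """All places reachable by following P89i_contains downwards."""
    seen = set()
    stack = [place]
    while stack:
        for child in CONTAINS.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def about_places(obj, loop_down):
    """Places the object is considered to be about, with or without looping."""
    direct = set(ABOUT.get(obj, set()))
    if not loop_down:
        return direct
    inferred = set(direct)
    for place in direct:
        inferred |= descendants(place)
    return inferred

print(sorted(about_places("PPA361578", loop_down=True)))
print(sorted(about_places("PPA361578", loop_down=False)))
```

With looping, a search for things about Rano Raraku returns an object that only represents Oceania as a whole; without looping, only the direct statement counts.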

About vs Keyword

RS-1486 Pending decision by Dominic

  • If I search using "about" I get 191 hits
  • If I search using keyword I get about 94 hits

Notes:

  • "About" loops down the place hierarchy so it incorporates many indirect hits. But I now think it shouldn't, see prev section.
  • Keyword:
    • FTS indexes all text fields, not just controlled terms. Eg "Easter Island" is included in object/EPF109333, although there's no link to the term, since these words are mentioned in bmo:PX_curatorial_comment
    • FTS walks the object graph, so indirect paths also contribute to FTS, eg
      object - P128_carries -> object/image/1 - P138_represents -> Place
    • FTS DOES NOT include broader terms nor altLabels: we thought there are already too many words in the index, and it's not specific enough.
      This means "Easter Island" is not automatically included in an object that mentions "Rano Raraku".

Dominic, please comment:

  1. Should we include broader terms in FTS?
  2. Should we include altLabels in FTS?

Direct Object Load

RS-1478
Use case: I sometimes know the object that I want to load into the tool and going through search simply to get to it is not very efficient. It would be useful to have a quick object number mechanism for loading a record into a tool.

Vlado: this is a combination of:

  • Search FR "Identified by", and use the top-right search box for [Keyword-Date-Identifier Search] as a convenience
  • "if single result, jump into it instead of showing a singleton result list".
    Jana: removing the intermediate step (Exhibit) is >2p/d and seems to be lower business value

Notes:

  • This is useful for all searches, not only Identifier search
  • An Identifier search doesn't necessarily return 1 object (eg regno may be repeated/reused)

Potential Enhancements

These enhancements may be implemented in the future, or maybe never.

Dimension Search

RS-1009
RS-1285
There's no Dimension search in FRs (there's no Dimension FC at all).
Complications include:

  • how to specify the 3 components of a dimension: type, unit, number
  • does it matter what the dimension applies to (not all are at the Object level)
  • conversions between units
  • range search

Grammar-Based Search

RS-1011
RS-1030
FRs collapse the richness of CRM relations into a few select alternatives. Although those are complex widely-reaching networks, they reflect FORTH's understanding about what is interesting.
The current "search sentence" UI allows only simple searches: AND of ORs, regarding Things.

  • FRs are not composable, so one can only combine at the root (Thing). One cannot use "Composite FRs", eg
    things Exhibited by the Metropolitan
    things Exhibited in Sofia
    things Auctioned in Antwerp
  • There are various bugs caused by the pseudo-smartness of this UI, eg
    RS-849
    RS-850
    RS-1276

A search based on Controlled Natural Language (grammar-based) will resolve these problems.

Show Matched Sub-FR in Results

RS-1015
Because the FR is an aggregation it would be useful to identify the matched Sub-FR (actual relationship) in the search results.
So "by Ruffo, Don Antonio". The FR "By" would in this case refer to the fact that Ruffo commissioned the painting (rather than painted it or something else). It would be great if this was identified clearly in the summary results. The result would have under it "commissioned by Ruffo, Don Antonio". It would then be good if you could use this to see what other works were commissioned by Ruffo, using the result as the search link.

Association Codes Sub-FRs

RS-1697
The association codes would be useful for incorporation into the search system.

As per BM Association Mapping, assoc codes are mapped to P2_has_type, either of:

  • association reification of a specific CRM property, or of
  • specific CRM event type (less often).

BM COL Association Codes

In BM COL, assoc codes can be used as a facet over a selected entity (eg person):

RS Association Codes

For RS the first usage would be to refine the FRs to a hierarchy. Partial example:

Created/Modified
  Produced By
    Produced By - Specific Process
      Drawn
      Printed
    Produced By - Degree of Likelihood
      Probably Produced By
        Attributed to
        Attributed to an Apprentice/Pupil of
      Unlikely Produced By
        Formerly attributed to
        Inscription Created By
    Part Made By
      Bell Made By
      Ebauche Made By
  Modified
    Repaired
Influenced

  • This is related to Show Matched Sub-FR in Results, since it would also require some extension of the FR system to expose more than "aggregations of FRs"
  • Also, it would require building the hierarchical structure above: currently association codes are in a flat skos:inScheme, but we need to put them in separate schemes, or create a skos:broader hierarchy. The hierarchy needs to be matched to the appropriate CRM property.
    TODO Vlado: I wrote a longish email to Josh & Dominic around Dec 2012, need to dig it out

Searchable Object Kinds

RS-363
We currently search only for Things (museum objects) and small application objects like annotations & bookmarks.

  • Should we be able to search for people, places, events (such as Exhibitions), what else?
  • Should we have a unified search, or per-object-kind searches?

Vlado:

  • The answer should be substantiated with consideration what data we have about them.
    • In RKD we don't have enough data about these things to create separate searches for them
    • Although Auctions/Exhibitions are not independent objects, we could have a search by sub-object, eg:
      search for Event in the object's lifetime, with given characteristics (Type, Date, Actor..)
    • In BM thesauri there's plenty of data about people (eg life dates, field of activity, short bio), institutions, places.
      So BM authorities provide better scope for this
  • Lists of Objects relating to the thing (eg the inverse of "Thing from Place") could be interesting
  • Adding other search kinds will complicate the search, display of results, and general UI handling
    • Per-kind search could be accommodated by the FR framework: you first select the FC (Thing, Actor, etc), which determines the applicable FRs.
    • Different object kinds have different fields, so result display and faceting of mixed result lists would be a major challenge
    • Even unifying the Search Result Fields of RKD and BM museum objects was no small feat

Dominic:

  • I agree that ResearchSpace will need to deal with different content types. These might be items that are contained in authority files (people for example).
  • It will also need to produce results for other records that have associated metadata, including archive records. This may be determined by the URI. So an object has an id/object/ URI but an archive record may have /id/archive. This would demand a different representation if the user clicks on the result, but this would also be true of different object types that have different metadata (a coin as opposed to a painting). The RKD may have an object record collection.rkd.org/id/object/12344 but we may need to use /id/object/2d/12344 and/or /id/object/archive/23903, which we would know would be different record representations from the RKD.
  • We may need to associate different RForm representations with different domain names and ID types.
    Vlado: yes, we use different RKD and [BM RForm].

Search Scope

RS-1005
Dominic: Search options should allow a user to change the datasets that are being searched: all available datasets, or selected datasets including the project dataset. Eg:

  • All data
  • This project's data
  • RKD data
  • BM data

Defaults can also be set.

Named Graphs or Nested Repositories or Post Filtering

Named Graphs (G in the quad <S,P,O,G>): we have not yet committed to what we will use them for, but they can be used for only one purpose.

  • Josh uses one G per object, so he can easily implement data updates (delete old object graph; then reinsert whole object).
  • We could strip Josh's per-object G and use a fixed G on import (RKD vs BM): a non-trivial task
  • Inference cannot be limited per graph. Given a coreferenced term, I cannot define an FR to select or unselect named graphs.

Nested (Virtual) Repositories (as implemented in OWLIM 5.3) is a better choice if we want to limit reasoning to subsets. The administrator would set up several repositories as combinations of real repositories, and the switch will select amongst them.

Post Filtering: is the simplest solution if we're happy to limit to subsets of objects only:

  • mark each object root somehow (eg RKD objects are rso:E22_Museum_Object, and BM objects P50_has_current_keeper the-british-museum)
  • add an implicit search clause to look for the marker corresponding to the selected scope
  • one benefit is that the filtering can be more granular and dynamic, eg
    • "Coins and Medals Department": implement as P50_has_current_keeper <thesauri/department/C>
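
A sketch of the implicit search clause for Post Filtering, assuming the scope markers listed above. The SCOPE_MARKERS table, the scope names used as keys, the <the-british-museum> placeholder and the scope_clause function are all hypothetical; the real clause would have to use the actual marker URIs.

```python
# Assumed mapping from the selected scope to a SPARQL pattern fragment.
SCOPE_MARKERS = {
    "All data": None,
    "RKD data": "a rso:E22_Museum_Object",
    "BM data": "crm:P50_has_current_keeper <the-british-museum>",
    "Coins and Medals Department": "crm:P50_has_current_keeper <thesauri/department/C>",
}

def scope_clause(scope, var="?object"):
    """Extra triple pattern to append to the search query for the given scope."""
    marker = SCOPE_MARKERS[scope]
    return "" if marker is None else f"{var} {marker} ."

for scope in SCOPE_MARKERS:
    print(repr(scope_clause(scope)))
```

Since the clause is just one more pattern on the object root, new scopes (a department, a keeper) can be added by extending the table rather than by reloading data into separate graphs or repositories.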

Project or Data Origin

The original spec speaks about Project Dataspace vs Shared Dataspace, not about Dataset (i.e. origin of the data).

If by "Scope" we mean Data origin, then we'd have these settings:

  • All data
  • British Museum data
  • Rembrandt data

If by "Scope" we mean Project Dataspace, then

  • there will be only 2 settings, called eg:
    • This project's data
    • This project and common data
  • But RS3 does not yet have multiple projects, and that's a big task involving Security and Groups

Other Future Enhancements

TODO Austin's Assessment

Reply to all of these
I followed more or less the test cases on Confluence for Coreferencing searching on RS3.5.

  • Auto-complete and suggestions are great, but there are so many that it is difficult to figure out which to choose. This feature is helpful especially for the Dutch names but will be frustrating for some users. It would be great if there were an easy way to differentiate immediately between People-Places-Things-etc. Maybe colour codes? Or icons?
    Vlado: The kind of term and thesaurus is seen in the notation, eg [BM Place]
  • For example if I type in "Hague" - a list of possible people named Hague comes up, with one Place (BM place/RKD place). When I select "Hague, The" - the search stays at "by Hague, The" rather than "met, Hague, The". When I change the search to "met, Hague, The" - I get no results. I only get results with "keyword".
  • I admit the "met" search took time to figure out - and we will have to find a way of making this simple for users!
  • Also the co-references of the Thesaurus is not always easy - and Rembrandt data is only really searchable with Rembrandt terminology - if I search for keyword "oak" this does not come up with anything as their term is "panel (oak)"
    Vlado: Yep, RS currently has only 1 person and 15 major cities that are coreferenced manually
  • Timeline view only works for RKD data? on BM data I either get a "no results" or "working screen"...
    Vlado: For a long time BM data didn't have proper XSD dates in the appropriate properties. This should work now

TODO Jonathan's feedback

There was a very good email by Jonathan (I think), forwarded by Dominic. To be added here, broken into subsections.
Asked amongst other things: "how do you search by bibliography"

TODO Re-collect from Jira

Do another pass through search issues posted by Dominic, move to subtasks of RS-1321@jira, make more subsections
