| Lon | 1151 | 2666 | 20075 | 4740 | 5146 |
| London | 448 | 1937 | 20606 | 2832 | 4426 |

h1. LuceneDirect vs. Lucene/Owlim
- implemented direct Lucene indexing (using the Lucene API). The index is built from the existing thesaurus terms and has the following fields:
|| Field || Description || STORE/INDEX options ||
| CONTENT | All rdf:label-s | Store.NO, Index.ANALYZED |
| RANK | rdfRank from OWLIM; used in the final Lucene ordering | Store.YES, Index.NO |
| THES | Space-separated list of the thesauri to which the current item belongs | Store.YES, Index.NO |
| LABEL | prefLabel, previously computed by the rule (en, none, nl) and cached in this field | Store.YES, Index.NO |
| SCOPENOTE | The scope note, previously computed and cached in this field | Store.YES, Index.NO |
| URI | The URI of the item | Store.YES, Index.NO |
- indexing takes ~30 minutes, including the execution of the Owlim queries
- the performance of the search varies from 22% (for easier searches, like "Poorter" above) to 8800% (for "Lon*"). See the spreadsheet for details
- tested that the top results of the current search and the new search match, at least up to a certain point: for 100 or more results we cannot expect the whole order to be the same, only the order of the items with a non-zero rank
- there are two differences, for "rem*" and for "ams*", where the LuceneDirect search returns 8 and 1 fewer results, respectively. This is probably related to some of the searcher options in Lucene and seems OK
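
The result comparison described above (the top results of the two searches should agree for items with a non-zero rank) could be checked along the following lines. This is only a minimal sketch in plain Java with no Lucene dependency; the class name {{ResultComparison}}, its methods, and the sample URIs are hypothetical and are not taken from the actual test code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares two ordered result lists (item URIs),
// looking only at items that have a known, non-zero rank.
public class ResultComparison {

    /** Keeps only the URIs whose rank is known and non-zero, preserving order. */
    static List<String> rankedOnly(List<String> uris, Map<String, Double> rankByUri) {
        List<String> kept = new ArrayList<>();
        for (String uri : uris) {
            Double rank = rankByUri.get(uri);
            if (rank != null && rank != 0.0) {
                kept.add(uri);
            }
        }
        return kept;
    }

    /**
     * Returns how many leading ranked results are identical in both lists.
     * Zero-rank items are dropped first, since their relative order is not
     * expected to be the same in the two searches.
     */
    static int matchingPrefixLength(List<String> currentResults,
                                    List<String> newResults,
                                    Map<String, Double> rankByUri) {
        List<String> a = rankedOnly(currentResults, rankByUri);
        List<String> b = rankedOnly(newResults, rankByUri);
        int limit = Math.min(a.size(), b.size());
        int i = 0;
        while (i < limit && a.get(i).equals(b.get(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Map<String, Double> rank = new HashMap<>();
        rank.put("urn:item:a", 0.9);
        rank.put("urn:item:b", 0.5);
        rank.put("urn:item:c", 0.0);
        rank.put("urn:item:d", 0.0);

        List<String> current = List.of("urn:item:a", "urn:item:b", "urn:item:c", "urn:item:d");
        List<String> direct  = List.of("urn:item:a", "urn:item:b", "urn:item:d", "urn:item:c");

        System.out.println("Matching ranked prefix: "
                + matchingPrefixLength(current, direct, rank));
    }
}
```

Filtering out zero-rank items before comparing keeps the check from failing on ties, where neither search guarantees a particular order.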