Current version by Neli Hateva on Apr 20, 2016 18:27.

{panel}This page includes information about the gazetteer PR and the two LRs used by the gazetteer - TRIE Cache and Metadata.{panel}

{toc}

h3. Purpose of the Linked Data Gazetteer Processing Resource

## (optional) {{path2ignorewords}} \- the path to a plain text file containing words that should be ignored when filling the cache;
## (optional) {{path2rules}} \- the path to a Groovy source file containing rules for rewriting a label. The source must contain a {{workFlow}} method that returns a set of Strings. The Groovy code creates derivatives of a label; when a label is added to the cache, all its derivatives are added as well. For example, suppose you want to look up the label "Obama, Barack", but most articles mention the name as "Barack Obama". A rule in the Groovy code can return both "Obama, Barack" and "Barack Obama" when it is passed "Obama, Barack", so that both labels are added to the cache for matching.
## {{queryFile}} \- URL of a file containing SPARQL queries that return a list of triplets - {{<URI instance, Literal label, URI type>}}. The SPARQL queries in the file are separated by {{@@@}}. They are executed one after another and the result of each is fed to the cache. If there is no file for deserialization in {{path2dic}}, the queries are executed on the SPARQL server defined in {{connectionString}}. SPARQL variables bound to elements of the triplets:
{code}
?concept <-> URI instance;
{code}
## (optional) {{updateable}} \- in some cases the cache should not accept updates after the initial load. Set this to 'false' in such cases.
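
The label-rewriting idea behind {{path2rules}} can be sketched as follows. This is only an illustration of the logic in Python; the actual rules file must be Groovy, and the exact signature the gazetteer expects for {{workFlow}} is not shown on this page. The function names {{work_flow}} and {{fill_label_cache}} and the comma-inversion rule are assumptions made for the example.

```python
from typing import Iterable, Set


def work_flow(label: str) -> Set[str]:
    """Return the label together with its derivatives (hypothetical rule set)."""
    derivatives = {label}
    # Assumed rule: "Surname, Given" also yields "Given Surname".
    # It only fires when the label has exactly one comma and both parts are non-empty.
    parts = [part.strip() for part in label.split(",")]
    if len(parts) == 2 and all(parts):
        surname, given = parts
        derivatives.add(f"{given} {surname}")
    return derivatives


def fill_label_cache(labels: Iterable[str]) -> Set[str]:
    """Add every label and all of its derivatives to a (simplified) cache."""
    cache: Set[str] = set()
    for label in labels:
        cache.update(work_flow(label))
    return cache
```

With this sketch, passing "Obama, Barack" yields both "Obama, Barack" and "Barack Obama", while a label without a comma is returned unchanged.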

Note: The SPARQL endpoint must be open, or its security must be turned off, because there are no parameters for a user name and password.
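
The way a {{queryFile}} is consumed (queries separated by {{@@@}}, executed in order, each result fed to the cache) can be sketched in Python as below. This is a rough model, not the gazetteer's implementation. The variable names {{?label}} and {{?type}} are assumptions (only {{?concept}} is confirmed above), and {{run_query}} and {{fake_endpoint}} are hypothetical stand-ins for a real SPARQL client.

```python
QUERY_SEPARATOR = "@@@"

# Example query file content; ?label and ?type are assumed variable names.
SAMPLE_QUERY_FILE = """
SELECT ?concept ?label ?type WHERE {
  ?concept a ?type ;
           <http://www.w3.org/2000/01/rdf-schema#label> ?label .
}
@@@
SELECT ?concept ?label ?type WHERE {
  ?concept a ?type ;
           <http://www.w3.org/2004/02/skos/core#prefLabel> ?label .
}
"""


def split_queries(text):
    """Split the file content on the separator and drop empty chunks."""
    return [chunk.strip() for chunk in text.split(QUERY_SEPARATOR) if chunk.strip()]


def load_cache_from_queries(text, run_query):
    """Run the queries one after another and collect their rows in order."""
    cache = []
    for query in split_queries(text):
        for concept, label, type_uri in run_query(query):
            cache.append((concept, label, type_uri))
    return cache


def fake_endpoint(query):
    """Hypothetical stand-in for a SPARQL endpoint: one canned row per query."""
    if "prefLabel" in query:
        return [("http://example.org/B", "Beta", "http://example.org/Thing")]
    return [("http://example.org/A", "Alpha", "http://example.org/Thing")]
```

Calling {{load_cache_from_queries(SAMPLE_QUERY_FILE, fake_endpoint)}} returns the rows of the first query followed by those of the second, mirroring the sequential execution described above.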

h3. Metadata Language Resource

# Right-click 'Language Resources \-> TRIE Cache for Linked Data Gazetteer' to create a TRIE Cache for the Linked Data Gazetteer Language Resource.
# Set the parameter values; refer to the TRIE Cache Parameters section for more information.
# Click 'OK'. The TRIE Cache Language Resource then prepares its cache by evaluating the queries against the SPARQL endpoint or by deserializing existing data.
# (Optional) Right-click 'Language Resources \-> Metadata LR' to create a Metadata Language Resource.
# (Optional) Set the parameter values; refer to the Metadata Language Resource Parameters section for more information.
# (Optional) Click 'OK'. The Metadata Language Resource then prepares its cache by evaluating the queries against the SPARQL endpoint or by deserializing existing data.
# Right-click 'Processing Resources \-> Linked Data Gazetteer' to create a Linked Data Gazetteer Processing Resource.
# Open the pipeline where the Linked Data Gazetteer must be added and add the newly created instance at the desired position within the pipeline.