h3. Metadata Language Resource

# Using the Resource: This resource represents a cache structure that maps {{identifier(instance URI, class URI)}} \-> a list of semantic metadata features, each in the form {{identifier(feature value, feature name)}}. The Linked Data Gazetteer Processing Resource uses it to assign additional features to the Lookup annotations it generates. The Metadata Language Resource and the TRIE Cache Language Resource are tightly coupled because they should use common structures to identify the entity instances represented by the Lookup annotations.
# Initialization: This Language Resource has several initialization parameters, two of which are mandatory. The parameters include the location of the SPARQL endpoint used for cache loading, the paths to the structures shared with the TRIE Cache Language Resource, and the path to a file containing the queries used to load the cache.
# Parameters:
## connectionString - SPARQL endpoint of the remote repository where the semantic data is stored
## indexFilePath - path to the filesystem location where the cache will be serialized
## pathToEntityPoolFolder - location of the Entity Pool structure, which is shared with the TRIE Cache Language Resource. The value of this parameter should be exactly the same as the value of the 'path2dic' parameter of the TRIE Cache
## queryFilePath - path to the file containing the queries that load the resource data
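The mapping described above can be pictured as a map keyed by an (instance URI, class URI) pair whose values are lists of (feature value, feature name) pairs. The following is only a minimal sketch under that reading: the type names {{EntityKey}}, {{Feature}} and {{MetadataCacheSketch}} are hypothetical and are not taken from the plugin, whose actual cache is shared with the TRIE Cache and serialized to disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these type names come from the plugin itself.
final class MetadataCacheSketch {

    // Cache key: an entity instance identified by its instance URI and class URI.
    record EntityKey(String instanceUri, String classUri) {}

    // One semantic metadata feature, stored as a (value, name) pair.
    record Feature(String value, String name) {}

    private final Map<EntityKey, List<Feature>> cache = new HashMap<>();

    // Appends a feature to the list kept for the given entity.
    void put(EntityKey key, Feature feature) {
        cache.computeIfAbsent(key, k -> new ArrayList<>()).add(feature);
    }

    // Returns an immutable copy of the entity's features (empty if unknown).
    List<Feature> featuresFor(EntityKey key) {
        return List.copyOf(cache.getOrDefault(key, List.of()));
    }

    public static void main(String[] args) {
        MetadataCacheSketch cache = new MetadataCacheSketch();
        // The URIs below are made-up example values.
        EntityKey london = new EntityKey(
                "http://example.org/resource/London",
                "http://example.org/ontology/City");
        cache.put(london, new Feature("London", "label"));
        cache.put(london, new Feature("GB", "countryCode"));
        System.out.println(cache.featuresFor(london));
    }
}
```

Using a value type for the key means two lookups with the same URI pair reach the same entry, which is the property the shared identification structures are meant to provide.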

h3. Linked Data Gazetteer Processing Resource

# Using the Resource: This Processing Resource runs over the document text and produces Lookup annotations, optionally with semantic data features. It uses the TRIE Cache Language Resource and, optionally, the Metadata Language Resource.
# Runtime parameters:
## cacheLR - the TRIE Cache Language Resource that will be used for matching
## (optional) inputAsName - the name of the annotation set that contains the Token annotations; the default is <null>, i.e. the default annotation set
## (optional) metadataLR - the Metadata Language Resource bound to the corresponding TRIE Cache Language Resource
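The effect of the optional {{metadataLR}} parameter can be illustrated with a small sketch: a Lookup annotation always carries the features of the matched entity, and extra metadata features are added only when a metadata source is supplied. Everything here is an assumption made for illustration: the feature names {{inst}} and {{class}}, the simplification of keying metadata by instance URI alone, and the class name {{LookupFeatureSketch}} are not documented behaviour of the plugin.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the feature names "inst" and "class" and all type
// names are assumptions, not the plugin's documented output.
final class LookupFeatureSketch {

    // Builds the feature map for one Lookup annotation.
    // metadataByInstance may be null, which models metadataLR being unset.
    static Map<String, String> lookupFeatures(
            String instanceUri,
            String classUri,
            Map<String, Map<String, String>> metadataByInstance) {
        Map<String, String> features = new LinkedHashMap<>();
        features.put("inst", instanceUri);
        features.put("class", classUri);
        if (metadataByInstance != null) {
            Map<String, String> extra = metadataByInstance.get(instanceUri);
            if (extra != null) {
                // Do not let metadata overwrite the identifying features.
                extra.forEach(features::putIfAbsent);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // Made-up example data.
        Map<String, Map<String, String>> metadata =
                Map.of("http://example.org/resource/London", Map.of("label", "London"));
        System.out.println(lookupFeatures(
                "http://example.org/resource/London",
                "http://example.org/ontology/City",
                metadata));
    }
}
```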

h3. Step-by-step guide for creating and adding a Linked Data Gazetteer Processing Resource into a pipeline

# open a GATE Developer instance
# (optional) load the GATE application to which the Gazetteer PR is to be added
# load the gazetteer CREOLE plugin:
## File \-> Manage CREOLE plugins
## click the '+' button
## type in the directory location (prefixing it with 'file://') or use the 'Select a Directory' button
## click 'OK', then check the 'Load Now' option for the newly loaded plugin
## click 'Apply All'
# prepare a helper application
## right-click 'Applications' in the left-hand side menu and select 'Create New Application \-> Conditional Corpus Pipeline'
## double-click the created pipeline and add an existing tokenization PR to it; if no tokenization Processing Resources are present, use the following procedure:
### load the ANNIE CREOLE plugin from the 'Manage CREOLE plugins' screen
### right-click 'Processing Resources \-> ANNIE English Tokeniser'
### add the newly created resource to the helper pipeline
# right-click 'Language Resources \-> Trie Cache for Linked Data Gazetteer' to create a TRIE Cache for Linked Data Gazetteer Language Resource
# set the parameter values; refer to the TRIE Cache Parameters section for more information
# click 'OK'; the TRIE Cache Language Resource should begin preparing its cache by evaluating the queries or deserializing existing data
# (optional) right-click 'Language Resources \-> Metadata LR' to create a Metadata Language Resource
# (optional) set the parameter values; refer to the Metadata Language Resource Parameters section for more information
# (optional) click 'OK'; the Metadata Language Resource should begin preparing its cache by evaluating the queries or deserializing existing data
# right-click 'Processing Resources \-> Linked Data Gazetteer' to create a Linked Data Gazetteer Processing Resource
# open the pipeline to which the Linked Data Gazetteer should be added and add the newly created instance at the desired position within the pipeline
# select the Linked Data Gazetteer PR and set its runtime parameters; refer to the Linked Data Gazetteer Processing Resource Runtime parameters section
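The guide sets parameters on the TRIE Cache and on the Metadata Language Resource in separate steps, yet 'pathToEntityPoolFolder' and 'path2dic' are expected to hold exactly the same value. A quick check before creating the Metadata Language Resource might look like the sketch below. It assumes, purely for illustration, that both parameter sets are available as plain string maps; the class and method names are hypothetical and not part of the plugin or of GATE.

```java
import java.util.Map;

// Hypothetical pre-flight check; not part of the plugin or of GATE.
final class EntityPoolPathCheck {

    // True only if both resources name exactly the same Entity Pool location.
    // The comparison is a plain string match, with no path normalization.
    static boolean sharesEntityPool(
            Map<String, String> trieCacheParams,
            Map<String, String> metadataParams) {
        String trieValue = trieCacheParams.get("path2dic");
        String metadataValue = metadataParams.get("pathToEntityPoolFolder");
        return trieValue != null && trieValue.equals(metadataValue);
    }

    public static void main(String[] args) {
        // Made-up example values.
        Map<String, String> trie = Map.of("path2dic", "/data/gazetteer/pool");
        Map<String, String> metadata = Map.of("pathToEntityPoolFolder", "/data/gazetteer/pool");
        if (!sharesEntityPool(trie, metadata)) {
            System.err.println("pathToEntityPoolFolder differs from path2dic");
        } else {
            System.out.println("Entity Pool location is shared");
        }
    }
}
```

The check deliberately avoids normalizing the paths, since the parameter description asks for identical values rather than equivalent ones.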