CES Components

Overview

This page describes the components required to build a resilient Concept Extraction API with dynamically updated Gazetteer dictionaries. If you only need to extract named entities from text with a static dictionary and do not need high availability, a single worker is sufficient.

Worker

Configuration

General

GATE

Recommended JVM settings

Coordinator

Configuration

All timeouts are in milliseconds unless specified otherwise.

General

GraphDB

Workers

Updates (dictionaries)

Updates (models)

Annotation

Watchdog / heartbeat checker

Files

All files are relative to ~/.coordinator/[${coordinator.name}]/, that is, ~/.coordinator/ if coordinator.name is unset and
~/.coordinator/<coordinator.name>/ if it is set.
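
The directory rule above can be illustrated with a minimal Python sketch. It is not part of the Coordinator; the function name coordinator_home and its parameters are hypothetical, and treating an empty name as "unset" is an assumption made for the example.

```python
from pathlib import Path
from typing import Optional


def coordinator_home(coordinator_name: Optional[str] = None,
                     user_home: Optional[Path] = None) -> Path:
    """Return the directory that Coordinator files are resolved against.

    coordinator_name mirrors the coordinator.name setting; user_home
    defaults to the current user's home directory.
    """
    base = (user_home or Path.home()) / ".coordinator"
    # Assumption: an empty or whitespace-only name counts as "unset".
    if coordinator_name and coordinator_name.strip():
        return base / coordinator_name.strip()
    return base
```

For example, coordinator_home("ces-prod") resolves to the ces-prod subdirectory of ~/.coordinator, while coordinator_home() resolves to ~/.coordinator itself. The name ces-prod is only an illustration.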

JVM settings

GraphDB and EUF plug-in

GraphDB is the semantic database required for dynamic dictionary updates. If you do not already have GraphDB, obtain it before continuing and refer to the official GraphDB 6.0 documentation.

EUF stands for 'Entity Updates Feed'. This plug-in publishes entity update feeds which are consumed by the Coordinator.

Configuration

To install the EUF plug-in in GraphDB:

  1. Provide the following Java parameter to GraphDB on startup:
    -Dregister-external-plugins=/your/plugins/home
  2. Unpack the EUF plug-in in your plugins home (prior to starting GraphDB).
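
The two steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not an official installer: it assumes the EUF plug-in is distributed as a zip archive, and the function name install_plugin is hypothetical. It unpacks the archive into the plugins home and returns the Java parameter to pass to GraphDB on startup; how GraphDB itself is launched depends on your deployment and is not shown.

```python
import zipfile
from pathlib import Path


def install_plugin(archive: Path, plugins_home: Path) -> str:
    """Unpack a plug-in archive and return the matching JVM parameter.

    Assumption: the plug-in ships as a zip file. Run this before
    starting GraphDB so the plug-in is present at startup.
    """
    # Create the plugins home if it does not exist yet.
    plugins_home.mkdir(parents=True, exist_ok=True)
    # Step 2 of the list: unpack the plug-in into the plugins home.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(plugins_home)
    # Step 1 of the list: the Java parameter GraphDB needs on startup.
    return f"-Dregister-external-plugins={plugins_home}"
```

The returned string is then added to the Java options used to start GraphDB, for example alongside the other JVM settings of your container or startup script.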