This release introduces support for the Elasticsearch GraphDB Connector in a GraphDB Cluster. As the connector works at a lower level than the cluster synchronisation, it requires a transactional entity pool (to ensure entity IDs are consistent within the cluster). The default entity pool is a non-transactional one. Please, refer to [GraphDB-SE Entity Pool] to enable a transactional entity pool.
{note}
Note that GraphDB does not allow creation of a connector instance if the wrong entity pool is used.
{note}

{tip:title=What we recommend}
Use the GraphDB Connectors management interface provided by the GraphDB Workbench as it lets you create the configuration easily, and then create the connector instance directly or copy the configuration and execute it elsewhere.
{tip}
The create command is triggered by a SPARQL *INSERT* with the *createConnector* predicate, e.g., it creates a connector instance called *my_index*, which synchronises the wines from the sample data above:
{div:style=width: 70em}{noformat}

The above command creates a new Elasticsearch connector instance that connects to the Elasticsearch instance accessible at port 9300 on the localhost as specified by the "elasticsearchUrl" key.
The "types" key defines the RDF type of the entities to synchronise and, in the example, it is only entities of the type <http://www.ontotext.com/example/wine#Wine> (and its subtypes). The "fields" key defines the mapping from RDF to Elasticsearch. The basic building block is the property chain, i.e., a sequence of RDF properties where the object of each property is the subject of the following property. In the example, three bits of information are mapped: the grape the wines are made of, sugar content, and year. Each chain is assigned a short and convenient field name: "grape", "sugar", and "year". The field names are later used in the queries.
Grape is an example of a property chain composed of more than one property. First, we take the wine's madeFromGrape property, the object of which is an instance of the type Grape, and then we take the rdfs:label of this instance. Sugar and year are both composed of a single property that links the value directly to the wine.
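The creation command described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact command from the sample: the wine# namespace of the properties and the property names hasSugar and hasYear are assumed, and the node address is given under the elasticsearchNode key as named in the list of creation parameters.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: the property URIs below (madeFromGrape in the wine# namespace,
# hasSugar, hasYear) are assumed names for the sample data
INSERT DATA {
  inst:my_index :createConnector '''
{
  "elasticsearchNode": "localhost:9300",
  "types": [
    "http://www.ontotext.com/example/wine#Wine"
  ],
  "fields": [
    {
      "fieldName": "grape",
      "propertyChain": [
        "http://www.ontotext.com/example/wine#madeFromGrape",
        "http://www.w3.org/2000/01/rdf-schema#label"
      ]
    },
    {
      "fieldName": "sugar",
      "propertyChain": [
        "http://www.ontotext.com/example/wine#hasSugar"
      ]
    },
    {
      "fieldName": "year",
      "propertyChain": [
        "http://www.ontotext.com/example/wine#hasYear"
      ]
    }
  ]
}
''' .
}
{noformat}{div}
The "grape" field is the only one with a two-element property chain; "sugar" and "year" each use a single property.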

h4. Mapping and index management
By default, GraphDB manages (creates, deletes or updates if needed) the Elasticsearch index and the Elasticsearch mapping. This makes it easier to use Elasticsearch as everything is done automatically. This behaviour can be changed by the following options:
* _manageIndex_: if true, GraphDB manages the index. True by default.
* _manageMapping_: if true, GraphDB manages the mapping. True by default.
{note}
Note that if either of the options is set to false, you have to create, update or remove the index/mapping and, in case Elasticsearch is misconfigured, the connector instance will not function correctly.
{note}
h5. Using a non-managed schema

{noformat}{div}
This creates the same connector instance as above but it expects fields with the specified field names to be already present in the index mapping, as well as some internal GraphDB fields. For the example, you must have the following fields:
|| field name || Elasticsearch config ||

Dropping a connector instance removes all references to its external store from GraphDB as well as the Elasticsearch index associated with it.
The drop command is triggered by a SPARQL *INSERT* with the *dropConnector* predicate where the name of the connector instance has to be in the subject position, e.g., this removes the connector *:my_index*:
{div:style=width: 70em}{noformat}

{noformat}{div}
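A minimal sketch of such a drop command, using the connector prefixes that appear elsewhere on this page. The empty string in the object position is an assumption; only the subject identifies the instance to drop.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: the object value "" is assumed to be ignored
INSERT DATA {
  inst:my_index :dropConnector "" .
}
{noformat}{div}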
h2. Listing available connector instances
Listing connector instances returns all previously created instances. It is a *SELECT* query with the *listConnectors* predicate:

{noformat}{div}
*?cntUri* is bound to the prefixed URI of the connector instance that was used during creation, e.g., <http://www.ontotext.com/connectors/elasticsearch/instance#my_index>, while *?cntStr* is bound to a string, representing the part after the prefix, e.g., "my_index".
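A listing query of this kind might look like the sketch below. The variable names follow the description above; the exact query shape is an assumption.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>

# Sketch only: one result row per previously created connector instance
SELECT ?cntUri ?cntStr {
  ?cntUri :listConnectors ?cntStr .
}
{noformat}{div}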
h2. Instance status check

{noformat}{div}
*?cntUri* is bound to the prefixed URI of the connector instance, while *?cntStatus* is bound to a string representation of the status of the connector represented by this URI. The status is key-value based.

h2. Adding, updating and deleting data
From the user point of view, all synchronisation happens transparently without using any additional predicates or naming a specific store explicitly, i.e., you should simply execute standard SPARQL INSERT/DELETE queries. This is achieved by intercepting all changes in the plugin and determining which abstract documents need to be updated.
h2. Simple queries
Once a connector instance has been created, it is possible to query data from it through SPARQL. For each matching abstract document, the connector instance returns the document subject. In its simplest form, querying is achieved by using a *SELECT* and providing the Elasticsearch query as the object of the *:query* predicate:
{div:style=width: 70em}{noformat}

{noformat}{div}
The result binds *?entity* to the two wines made from grapes that have "cabernet" in their name, namely :Yoyowine and :Franvino.
{note}
Note that you should use the field names you chose when you created the connector instance. They can be identical to the property URIs but you should escape any special characters according to what Elasticsearch expects.
{note}
# Get a query instance of the requested connector instance by using the RDF notation "X a Y" (= X rdf:type Y), where X is a variable and Y is a connector instance URI. X is bound to a query instance of the connector instance.
# Assign a query to the query instance by using the system predicate :query.
# Request the matching entities through the :entities predicate.
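The three steps above can be sketched as a single query. The query string "grape:cabernet" is an assumed Elasticsearch query against the "grape" field, and ?search is an arbitrary variable name.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: the query string is an assumed example
SELECT ?entity {
  ?search a inst:my_index ;          # step 1: get a query instance
      :query "grape:cabernet" ;      # step 2: assign the query
      :entities ?entity .            # step 3: request matching entities
}
{noformat}{div}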
It is also possible to provide per-query search options by using one or more option predicates. The option predicates are described in detail below.
h4. Raw queries
To access an Elasticsearch query parameter that is not exposed through a special predicate, use a raw query. Instead of providing a full text query in the :query part, specify raw Elasticsearch parameters. For example, to boost some parts of your full text query as described [here|http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html], execute the following query:
{div:style=width: 70em}{noformat}

h3. Combining Elasticsearch results with GraphDB data
The bound ?entity can be used in other SPARQL triples in order to build complex queries that fetch additional data from GraphDB, for example, to see the actual grapes in the matching wines as well as the year they were made:
{div:style=width: 70em}{noformat}

{noformat}{div}
The result looks like this:
|| ?entity || ?grape || ?year ||

| :Franvino | :CabernetFranc | 2012 |
{note}
Note that :Franvino is returned twice because it is made from two different grapes, both of which are returned.
{note}
h3. Entity match score
It is possible to access the match score returned by Elasticsearch with the *:score* predicate. As each entity has its own score, the predicate should come at the entity level. For example:
{div:style=width: 70em}{noformat}

{noformat}{div}
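A minimal sketch of a query that retrieves the score, assuming the same "grape:cabernet" query string as an example. Note that :score is attached to ?entity and not to the query instance.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: the query string is an assumed example
SELECT ?entity ?score {
  ?search a inst:my_index ;
      :query "grape:cabernet" ;
      :entities ?entity .
  # the score is requested per entity
  ?entity :score ?score .
}
{noformat}{div}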
The result looks like this but the actual score might be different as it depends on the specific Elasticsearch version:
|| ?entity || ?score ||

h2. Basic facet queries
Consider the sample wine data and the my_index connector instance described previously. You can also query facets using the same instance:
{div:style=width: 70em}{noformat}

{noformat}{div}
It is important to specify the facet fields by using the *facetFields* predicate. Its value is a simple comma-delimited list of field names. In order to get the faceted results, use the *facets* predicate. As each facet has three components (name, value and count), the facets predicate binds a blank node, which in turn can be used to access the individual values for each component through the predicates *facetName*, *facetValue*, and *facetCount*.
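A facet query along these lines might be sketched as follows, assuming faceting on the "year" and "sugar" fields of the my_index instance. The blank node label _:f and the variable ?r are arbitrary.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: facets by year and sugar content
SELECT ?facetName ?facetValue ?facetCount {
  ?r a inst:my_index ;
     :facetFields "year,sugar" ;   # comma-delimited list of field names
     :facets _:f .                 # one blank node per facet bucket
  _:f :facetName ?facetName ;
      :facetValue ?facetValue ;
      :facetCount ?facetCount .
}
{noformat}{div}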
The resulting bindings look like the following:
|| facetName || facetValue || facetCount ||

| sugar | medium | 2 |
You can easily see that there are three wines produced in 2012 and two in 2013. You also see that three of the wines are dry, while two are medium. However, it is not necessarily true that the three wines produced in 2012 are the same as the three dry wines as each facet is computed independently.
{tip:title=Faceting of textual fields}
Faceting by an analysed textual field works but might produce unexpected results. Analysed textual fields are composed of tokens and faceting uses each token to create a faceting bucket. For example, "North America" and "Europe" produce three buckets: "north", "america" and "europe", corresponding to each token in the two values. If you need to facet by a textual field and still do full-text search on it, it is best to create a copy of the field with the setting "analyzed":false. For more information, see [#Copy fields].
{tip}

For aggregations, the connector also supports sub-aggregations.
{info}
For more information on each supported facet or aggregation type, please, refer to the documentation of Elasticsearch.
{info}
h3. RDF mapping of the results
The results are accessed through the predicate *aggregations* (much like the basic facets are accessed through *facets*). The predicate binds multiple blank nodes, each of which contains a single aggregation bucket. The individual bucket items can be accessed through these predicates:
|| predicate || meaning || Elasticsearch counterpart ||

h2. Sorting
It is possible to sort the entities returned by a connector query according to one or more fields. Sorting is achieved by the *orderBy* predicate, the value of which is a comma-delimited list of fields. Each field can be prefixed with a minus to indicate sorting in descending order. For example:
{div:style=width: 70em}{noformat}

{noformat}{div}
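A sorted query of this kind might be sketched as follows. The query string "year:2013" is an assumed example; the minus prefix in "-sugar" requests descending order on the "sugar" field.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: wines from 2013, sorted by sugar content in descending order
SELECT ?entity {
  ?search a inst:my_index ;
      :query "year:2013" ;
      :orderBy "-sugar" ;
      :entities ?entity .
}
{noformat}{div}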
The result contains wines produced in 2013 sorted according to their sugar content in descending order:
|| entity ||

By default, entities are sorted according to their matching score in descending order.
{note}
Note that if you join the entity from the connector query to other triples stored in GraphDB, GraphDB might scramble the order. To remedy this, use ORDER BY from SPARQL.
{note}
{tip:title=Sorting by textual fields}
Sorting by an analysed textual field works but might produce unexpected results. Analysed textual fields are composed of tokens and sorting uses the least (in the lexicographical sense) token. For example, "North America" will be sorted before "Europe" because the token "america" is lexicographically smaller than the token "europe". If you need to sort by a textual field and still do full-text search on it, it is best to create a copy of the field with the setting "analyzed":false. For more information, see [#Copy fields].
{tip}

h2. Limit and offset
Limit and offset are supported on the Elasticsearch side of the query. This is achieved through the predicates *limit* and *offset*. Consider this example in which an offset of 1 and a limit of 1 are specified:
{div:style=width: 70em}{noformat}

{noformat}{div}
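A minimal sketch of a query with limit and offset. The query string "sugar:dry" and the use of string literals for the two values are assumptions.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: skip the first match and return at most one entity
SELECT ?entity {
  ?search a inst:my_index ;
      :query "sugar:dry" ;
      :offset "1" ;
      :limit "1" ;
      :entities ?entity .
}
{noformat}{div}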
The result contains a single wine, Franvino. If you execute the query without the limit and offset, Franvino will be second in the list:
|| entity ||

| Blanquito |
{note}
Note that the specific order in which GraphDB returns the results depends on how Elasticsearch returns the matches, unless sorting is specified.
{note}
h2. Snippet extraction
Snippet extraction is used to extract highlighted snippets of text that match the query. The snippets are accessed through the dedicated predicate *snippets*. It binds a blank node that in turn provides the actual snippets via the predicates *:snippetField* and *:snippetText*. The predicate :snippets must be attached to the entity, as each entity has a different set of snippets. For example, in a search for Cabernet:
{div:style=width: 70em}{noformat}

{noformat}{div}
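A snippet query of this kind might be sketched as follows, assuming "grape:cabernet" as the query string. The blank node label _:s is arbitrary.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: the query string is an assumed example
SELECT ?entity ?snippetField ?snippetText {
  ?search a inst:my_index ;
      :query "grape:cabernet" ;
      :entities ?entity .
  # snippets are attached to each entity, not to the query instance
  ?entity :snippets _:s .
  _:s :snippetField ?snippetField ;
      :snippetText ?snippetText .
}
{noformat}{div}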
The query returns the two wines made from Cabernet Sauvignon or Cabernet Franc grapes as well as the respective matching fields and snippets:
|| ?entity || ?snippetField || ?snippetText ||

| :Franvino | grape | <em>Cabernet</em> Franc |
{note}
Note that the actual snippets might be somewhat different as this depends on the specific Elasticsearch implementation.
{note}
It is possible to tweak how the snippets are collected/composed by using the following option predicates:

h2. Total hits
You can get the total number of hits by using the *:totalHits* predicate, e.g., for the connector instance :my_index and a query that retrieves all wines made in 2012:
{div:style=width: 70em}{noformat}

{noformat}{div}
As there are three wines made in 2012, the value 3 (of type xsd:long) is bound to ?totalHits.
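A minimal sketch of such a query, assuming "year:2012" as the query string against the "year" field:
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: returns the number of matches instead of the entities
SELECT ?totalHits {
  ?r a inst:my_index ;
     :query "year:2012" ;
     :totalHits ?totalHits .
}
{noformat}{div}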
h1. List of creation parameters

h3. elasticsearchNode (string), required, Elasticsearch instance to sync to
As Elasticsearch is a third-party service, you have to specify the node where it is running. The node value has the form *hostname.domain:port*. There is no default value.
h3. indexCreateSettings (string), optional, settings for creating the Elasticsearch index
This option is passed directly to Elasticsearch when creating the index. It can be in JSON, YAML or properties format.
h3. types (list of URI), required, specifies the types of entities to sync
The RDF types of entities to sync are specified as a list of URIs. At least one type URI is required.
h3. languages (list of string), optional, valid languages for literals
RDF data is often multilingual but you can map only some of the languages represented in the literal values. This can be done by specifying a list of language ranges to be matched to the language tags of literals according to RFC 4647, Section 3.3.1. Basic Filtering. In addition, an empty range can be used to include literals that have no language tag. The list of language ranges maps all existing literals that have matching language tags.
h3. fields (list of field object), required, defines the mapping from RDF to Elasticsearch
The fields define exactly what parts of each entity will be synchronised as well as the specific details on the connector side. The field is the smallest synchronisation unit and it maps a property chain from GraphDB to a field in Elasticsearch. The fields are specified as a list of field objects. At least one field object is required. Each field object has further keys that specify details.
h4. fieldName (string), required, name of the field in Elasticsearch

h4. propertyChain (list of URI), required, defines the property chain to reach the value
The property chain (propertyChain) defines the mapping on the GraphDB side. A property chain is defined as a sequence of triples where the entity URI is the subject of the first triple, its object is the subject of the next triple and so on. In this model, a property chain with a single element corresponds to a direct property defined by a single triple. Property chains are specified as a list of URIs where at least one URI must be provided.
The URI of the document will be synchronised to the special field "id" in Elasticsearch. You may use it to query Elasticsearch directly and retrieve the matching entity URI.

h4. indexed (boolean), optional, default true
If indexed, a field is available for Elasticsearch queries. True by default.
If true, this option corresponds to "index" = "analyzed" or "not_analyzed". If false, it corresponds to "index" = "no".

h4. multivalued (boolean), optional, default true
RDF properties and synchronised fields may have more than one value. If "multivalued" is set to true, all values will be synchronised to Elasticsearch. If set to false, only a single value will be synchronised. True by default.
h4. datatype (string), optional, manual datatype override
By default, the Elasticsearch GraphDB Connector uses datatype of literal values to determine how they should be mapped to Elasticsearch types. For more information on the supported datatypes, see [#Datatype mapping].
The mapping can be overridden through the property "datatype", which can be specified per field. The value of "datatype" can be any of the xsd: types supported by the automatic mapping or a native Elasticsearch type prefixed by native:, e.g., both xsd:long and native:long map to the long type in Elasticsearch.
h3. Copy fields
Often, it is convenient to synchronise one and the same data multiple times with different settings to accommodate for different use cases, e.g., faceting or sorting vs full-text search. The Elasticsearch GraphDB Connector has explicit support for fields that copy their value from another field. This is achieved by specifying a single element in the property chain of the form @otherFieldName, where otherFieldName is another non-copy field. Take the following example:
{div:style=width: 70em}{noformat}
...
...

{noformat}{div}
The snippet creates an analysed field "grape" and a non-analysed field "grapeFacet". Both fields are populated with the same values, and "grapeFacet" is defined as a copy field that refers to the field "grape".
{note}
Note that the connector handles copy fields in a more optimal way than specifying a field with exactly the same property chain as another field.
{note}
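A configuration with a copy field might be sketched as follows. This is an illustration under assumptions: the wine# namespace of madeFromGrape and the node address are assumed, and "grapeFacet" copies its value from "grape" through the single chain element @grape.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: "grape" is analysed for full-text search,
# "grapeFacet" is a non-analysed copy for faceting and sorting
INSERT DATA {
  inst:my_index :createConnector '''
{
  "elasticsearchNode": "localhost:9300",
  "types": [
    "http://www.ontotext.com/example/wine#Wine"
  ],
  "fields": [
    {
      "fieldName": "grape",
      "propertyChain": [
        "http://www.ontotext.com/example/wine#madeFromGrape",
        "http://www.w3.org/2000/01/rdf-schema#label"
      ],
      "analyzed": true
    },
    {
      "fieldName": "grapeFacet",
      "propertyChain": [
        "@grape"
      ],
      "analyzed": false
    }
  ]
}
''' .
}
{noformat}{div}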
h1. Datatype mapping
The Elasticsearch GraphDB Connector maps different types of RDF values to different types of Elasticsearch values according to the basic type of the RDF value (URI or literal) and the datatype of literals. The autodetection uses the following mapping:

| literal | xsd:long | long |
| literal | xsd:int | integer |
| literal | xsd:dateTime | date, format = date_optional_time |
| literal | xsd:date | date, format = date_optional_time |
{note}
Note that for any given field the automatic mapping uses the first value it sees. This works fine for clean datasets but might lead to problems if your dataset has non-normalised data, e.g., the first value has no datatype but other values have.
{note}
h1. Advanced filtering and fine tuning

h3. entityFilter (string)
The _entityFilter_ parameter is used to fine-tune the set of entities and/or individual values for the configured fields, based on the field value. Entities and field values are synchronised to Elasticsearch if, and only if, they pass the filter. The entity filter is similar to a FILTER() inside a SPARQL query but not exactly the same. Each configured field can be referred to, in the entity filter, by prefixing it with a "?", much like referring to a variable in SPARQL. Several operators are supported:
|| Operator || Meaning || Example ||
| ?var in (_value1_, _value2_, ...) | Tests if the field _var_'s value is one of the specified values. Values that do not match are treated as if they were not present in the repository. | {nf}?status in ("active", "new"){nf} |
| ?var not in (_value1_, _value2_, ...) | The negated version of the in-operator. | {nf}?status not in ("archived"){nf} |
| bound(?var) | Tests if the field _var_ has a valid value. This can be used to make the field compulsory. | bound(?name) |

h4. Accessing the previous element in the chain
The construction *parent(?var)* is used to go to a previous level in a property chain. It can be applied recursively as many times as needed, e.g., *parent(parent(parent(?var)))* goes back in the chain three times. The effective value of *parent(?var)* can be used with the *in* or *not in* operator like this: {nf}parent(?company) in (<urn:a>, <urn:b>){nf}.
h4. Accessing an element beyond the chain
The construction *?var -> _uri_* (alternatively *?var o _uri_* or just *?var _uri_*) is used to access additional values that are accessible through the property _uri_. In essence, this construction corresponds to the triple pattern _value_ _uri_ ?effectiveValue, where _value_ is a value bound by the field _var_. The effective value of *?var -> _uri_* can be used with the *in* or *not in* operator like this: {nf}?company -> rdf:type in (<urn:c>, <urn:d>){nf}. It can be combined with *parent()* like this: {nf}parent(?company) -> rdf:type in (<urn:c>, <urn:d>){nf}.
The URI parameter can be a full URI within < > or the special string _rdf:type_ (alternatively just _type_), which will be expanded to http://www.w3.org/1999/02/22-rdf-syntax-ns#type.

h4. Filtering by RDF graph
The construction *graph(?var)* is used to access the RDF graph of a field's value. The typical use case is to sync only explicit values: {nf}graph(?a) not in (<http://www.ontotext.com/implicit>){nf}. The construction can be combined with *parent()* like this: {nf}graph(parent(?a)) in (<urn:a>){nf}.
h4. Entity filters and default values

h3. Basic entity filter example
Given the following RDF data:
{div:style=width: 70em}{noformat}
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.ontotext.com/example#> .
# the entity below will be synchronised because it has a matching value for city: ?city in ("London")
:alpha
rdf:type :gadget ;
:name "John Synced" ;
:city "London" .
# the entity below will not be synchronised because it lacks the property completely: bound(?city)
:beta
rdf:type :gadget ;
:name "Peter Syncfree" .
# the entity below will not be synchronised because it has a different city value:
# ?city in ("London") will remove the value "Liverpool" so bound(?city) will be false
:gamma
rdf:type :gadget ;
:name "Mary Syncless" ;
:city "Liverpool" .
{noformat}{div}
If you create a connector instance such as:
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

{noformat}{div}
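A connector instance with such an entity filter might be sketched as follows. This is an illustration under assumptions: the node address, the instance name and the && conjunction between the two filter expressions are assumed. The quotes around "London" are escaped twice because the JSON is embedded in a SPARQL string.
{div:style=width: 70em}{noformat}
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>

# Sketch only: synchronises gadgets that have a city value equal to "London";
# the && operator is assumed to be supported by the entity filter
INSERT DATA {
  inst:my_index :createConnector '''
{
  "elasticsearchNode": "localhost:9300",
  "types": [
    "http://www.ontotext.com/example#gadget"
  ],
  "fields": [
    {
      "fieldName": "name",
      "propertyChain": [
        "http://www.ontotext.com/example#name"
      ]
    },
    {
      "fieldName": "city",
      "propertyChain": [
        "http://www.ontotext.com/example#city"
      ]
    }
  ],
  "entityFilter": "bound(?city) && ?city in (\\"London\\")"
}
''' .
}
{noformat}{div}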
The entity :beta is not synchronised as it has no value for _city_.
To handle such cases, you can modify the connector configuration to specify a default value for _city_:
{div:style=width: 70em}{noformat}
...
...

{noformat}{div}
The default value is used for the entity :beta as it has no value for city in the repository. As the value is "London", the entity is synchronised.
h3. Advanced entity filter example
Sometimes data represented in RDF is not well suited to map directly to non-RDF. For example, if you have news articles and they can be tagged with different concepts (locations, persons, events, etc.), one possible way to model this is a single property :taggedWith. Consider the following RDF data:
{div:style=width: 70em}{noformat}

{noformat}{div}
Now, if you map this data to Elasticsearch so that the property *:taggedWith _x_* is mapped to separate fields *taggedWithPerson* and *taggedWithLocation* according to the type of _x_ (events are not of interest here), you can map :taggedWith twice to different fields and then use an entity filter to get the desired values:
{div:style=width: 70em}{noformat}

{noformat}{div}
{note}
Note that *type* is the short way to write <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>.
{note}
The six articles in the RDF data above will be mapped as such:

| _sort_firstName | produced, if the option "sort" was true; used implicitly for ordering connector results |
The current version always produces a single Elasticsearch field per field definition in the configuration. This means that you have to create all appropriate fields based on your needs. See more under [#Creation parameters].
{tip}
