- Overview and features
- Setup and maintenance
- Creating a connector instance
- Dropping a connector instance
- Listing available connector instances
- Instance status check
- Working with data
- Adding, updating and deleting data
- Simple queries
- Basic facet queries
- Advanced facet and aggregation queries
- Limit and offset
- Snippet extraction
- Total hits
- List of creation parameters
- Datatype mapping
- Advanced filtering and fine tuning
- Overview of connector predicates
- Migrating from a pre-6.2 version
The GraphDB Connectors provide extremely fast normal and faceted (aggregation) searches of the kind typically implemented by an external component or service such as Solr, with the additional benefit of staying automatically up to date with the GraphDB repository data.
The Connectors provide synchronisation at the entity level, where an entity is defined as having a unique identifier (a URI) and a set of properties and property values. In terms of RDF, this corresponds to a set of triples that have the same subject. In addition to simple properties (defined by a single triple), the Connectors support property chains. A property chain is defined as a sequence of triples where each triple's object is the subject of the following triple.
The main features of the GraphDB Connectors are:
- maintaining an index that is always in sync with the data stored in GraphDB;
- multiple independent instances per repository;
- the entities for synchronisation are defined by:
  - a list of fields (on the Solr side) and property chains (on the GraphDB side) whose values will be synchronised;
  - a list of rdf:type's of the entities for synchronisation;
  - a list of languages for synchronisation (the default is all languages);
  - additional filtering by property and value;
- full-text search using native Solr queries;
- snippet extraction: highlighting of search terms in the search result;
- faceted search;
- sorting by any preconfigured field;
- paging of results using offset and limit;
- custom mapping of RDF types to Solr types.
Each feature is described in detail below.
All interactions with the Solr GraphDB Connector are done through SPARQL queries.
There are three types of SPARQL queries:
- INSERT for creating and deleting connector instances;
- SELECT for listing connector instances and querying their configuration parameters;
- INSERT/SELECT for storing and querying data as part of the normal GraphDB data workflow.
In general, INSERT adds or modifies data and SELECT queries existing data.
Each connector implementation defines its own URI prefix to distinguish it from other connectors. For the Solr GraphDB Connector, this is http://www.ontotext.com/connectors/solr#. Each command or predicate executed by the connector uses this prefix, e.g., http://www.ontotext.com/connectors/solr#createConnector to create a connector instance for Solr.
Individual instances of a connector are distinguished by unique names that are also URIs. They have their own prefix to avoid clashing with any of the command predicates. For Solr, the instance prefix is http://www.ontotext.com/connectors/solr/instance#.
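The two prefixes above can be captured once and reused in every query. The following is a minimal Python sketch that only builds strings. The prefix labels `:` and `inst:` and the helper names are choices made for this illustration, not part of the connector.

```python
# Namespaces quoted in the text above.
CONNECTOR_NS = "http://www.ontotext.com/connectors/solr#"
INSTANCE_NS = "http://www.ontotext.com/connectors/solr/instance#"

# SPARQL prologue; the labels ":" and "inst:" are an illustrative choice.
PREFIXES = (
    f"PREFIX : <{CONNECTOR_NS}>\n"
    f"PREFIX inst: <{INSTANCE_NS}>\n"
)


def command_uri(command: str) -> str:
    """Full URI of a connector command or predicate, e.g. createConnector."""
    return CONNECTOR_NS + command


def instance_uri(name: str) -> str:
    """Full URI of a connector instance, e.g. my_index."""
    return INSTANCE_NS + name


print(PREFIXES)
print(command_uri("createConnector"))
print(instance_uri("my_index"))
```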
All examples use sample data describing five fictitious wines (Yoyowine, Franvino, Noirette, Blanquito and Rozova) as well as the grape varieties required to make these wines. The minimum required ruleset level in GraphDB is RDFS.
This release introduces support for the Solr GraphDB Connector in a GraphDB Cluster. As the connector works at a lower level than the cluster synchronisation, it requires a transactional entity pool (to ensure entity IDs are consistent within the cluster). The default entity pool is non-transactional. Please refer to GraphDB-SE Entity Pool to enable a transactional entity pool.
Note that GraphDB will not let you create a connector instance if the wrong entity pool is used.
In order to be able to create new Solr cores on the fly, you have to use the custom admin handler provided with the Solr Connector. These are the necessary steps:
- Copy the file solr-core-admin-handler.jar from the folder tools of the GraphDB distribution to your Solr home.
- Tell Solr to scan the jar and use our custom handler instead of the default one. Add this to the root solr tag in solr.xml in your Solr home:
Note that this is a limitation of Solr and you are not required to use the custom handler. If you do not wish to deploy it, you will be responsible for creating the Solr core.
To use the connector, the schema of the core from which the configuration is copied (most often named collection1) must be configured to allow schema modifications (see managed and mutable in the Solr documentation).
A good starting point is the configuration from example-schemaless in the Solr distribution.
Creating a connector instance is done by sending a SPARQL query with the following configuration data:
- the name of the connector instance (e.g., my_index);
- a Solr instance to synchronise to;
- classes to synchronise;
- properties to synchronise.
The configuration data has to be provided as a JSON string representation and passed together with the create command.
What we recommend
Use the GraphDB Connectors management interface provided by the GraphDB Workbench as it will let you create the configuration easily, and then create the connector instance directly or copy the configuration and execute it elsewhere.
The create command is triggered by a SPARQL INSERT with the createConnector predicate, e.g., this will create a connector instance called my_index that will synchronise the wines from the sample data above:
Note that one of the fields has "multivalued": false. This is explained further under Sorting.
The above command creates a new Solr connector instance that connects to the Solr instance accessible at port 8983 on the localhost as specified by the "solrUrl" key.
The "types" key defines the RDF type of the entities to synchronise and, in the example, it is only entities of the type <http://www.ontotext.com/example/wine#Wine> (and its subtypes). The "fields" key defines the mapping from RDF to Solr. The basic building block is the property chain, i.e., a sequence of RDF properties where the object of each property is the subject of the following property. In the example, we map three pieces of information: the grape the wines are made of, sugar content, and year. Each chain is assigned a short and convenient field name: "grape", "sugar", and "year". The field names are later used in the queries.
Grape is an example of a property chain composed of more than one property. First, we take the wine's madeFromGrape property, the object of which is an instance of the type Grape, and then we take the rdfs:label of this instance. Sugar and year are both composed of a single property that links the value directly to the wine.
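The create command described above can be sketched as follows. This is an illustration built from the description, not the original listing. The property names `hasSugar` and `hasYear`, the `/solr` path in the URL, and the helper name are assumptions; the type URI, the `madeFromGrape` property, the field names and the `"multivalued": false` setting come from the text.

```python
import json

WINE = "http://www.ontotext.com/example/wine#"

# Connector configuration as a Python dict; it is serialised to JSON below.
config = {
    "solrUrl": "http://localhost:8983/solr",  # port from the text, path assumed
    "types": [WINE + "Wine"],
    "fields": [
        {
            "fieldName": "grape",
            # Two-step property chain: wine -> grape instance -> label.
            "propertyChain": [
                WINE + "madeFromGrape",
                "http://www.w3.org/2000/01/rdf-schema#label",
            ],
        },
        {
            "fieldName": "sugar",
            "propertyChain": [WINE + "hasSugar"],  # assumed property name
            "multivalued": False,  # needed later for sorting
        },
        {
            "fieldName": "year",
            "propertyChain": [WINE + "hasYear"],  # assumed property name
        },
    ],
}


def create_connector_update(name: str, cfg: dict) -> str:
    """Build the SPARQL INSERT that passes the JSON config to createConnector."""
    body = json.dumps(cfg, indent=2)
    return (
        "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
        "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
        "INSERT DATA {\n"
        f"  inst:{name} :createConnector '''\n{body}\n''' .\n"
        "}\n"
    )


CREATE_MY_INDEX = create_connector_update("my_index", config)
print(CREATE_MY_INDEX)
```

The configuration here contains no backslashes or single quotes, so it can be embedded in the `'''` literal unchanged; values that contain them would need SPARQL escaping first.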
By default, GraphDB will manage (create, delete or update, if needed) the Solr core and the Solr schema. This makes it easier to use Solr as everything will be done automatically. This behaviour can be changed by the following options:
- manageCore: if true, GraphDB will manage the core. True by default.
- manageSchema: if true, GraphDB will manage the schema. True by default.
The automatic core management requires the custom Solr admin handler provided with the GraphDB distribution. For more information, see Solr core creation.
Note that if either of the options is set to false, you will be responsible for creating, updating or removing the core/schema and, if you have misconfigured Solr, the connector instance will not function correctly.
The present version provides no support for changing some advanced options, such as stopwords, on a per-field basis. The recommended way to do this for now is to manage the schema yourself and tell the connector to just sync the object values in the appropriate fields. Here is an example:
This will create the same connector instance as above, but it expects fields with the specified field names to be already present in the core, as well as some internal GraphDB fields. For the example, you must have the following fields:
|field name||Solr config|
|_graphdb_id||<field name="_graphdb_id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>|
|_chains||<field name="_chains" type="string" indexed="true" stored="false" required="false" multiValued="false"/>|
|grape||<field name="grape" type="text_general" indexed="true" stored="true" multiValued="true"/>|
|sugar||<field name="sugar" type="text_general" indexed="true" stored="true" multiValued="false"/>|
|year||<field name="year" type="tints" indexed="true" stored="true" multiValued="true"/>|
_graphdb_id and _chains are used internally by GraphDB and are always required.
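As a rough illustration of a self-managed schema, the sketch below sets `manageSchema` to false and derives the list of Solr fields that would then have to exist in the core. The Solr URL, the `hasSugar` and `hasYear` property names and the helper `required_solr_fields` are assumptions made for this example.

```python
import json

WINE = "http://www.ontotext.com/example/wine#"

# Internal fields that the text says are always required.
INTERNAL_FIELDS = ["_graphdb_id", "_chains"]

config = {
    "solrUrl": "http://localhost:8983/solr",  # assumed URL
    "types": [WINE + "Wine"],
    "fields": [
        {"fieldName": "grape",
         "propertyChain": [WINE + "madeFromGrape",
                           "http://www.w3.org/2000/01/rdf-schema#label"]},
        {"fieldName": "sugar",
         "propertyChain": [WINE + "hasSugar"],  # assumed property name
         "multivalued": False},
        {"fieldName": "year",
         "propertyChain": [WINE + "hasYear"]},  # assumed property name
    ],
    "manageSchema": False,  # the schema is maintained by hand
}


def required_solr_fields(cfg: dict) -> list:
    """Fields that must already be defined in the Solr schema."""
    return INTERNAL_FIELDS + [field["fieldName"] for field in cfg["fields"]]


print(required_solr_fields(config))
print(json.dumps(config, indent=2))
```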
Dropping a connector instance removes all references to its external store from GraphDB as well as the Solr core associated with it.
The drop command is triggered by a SPARQL INSERT with the dropConnector predicate where the name of the connector instance has to be in the subject position, e.g., this will remove the connector :my_index:
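A drop command of this shape might look like the sketch below. The text only fixes the subject and the predicate; using an empty blank node `[]` as a placeholder object is an assumption here, as is the helper name.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def drop_connector_update(name: str) -> str:
    """Build an INSERT with the instance as subject of :dropConnector."""
    return (
        PREFIXES
        + "INSERT DATA {\n"
        + f"  inst:{name} :dropConnector [] .\n"  # object is a placeholder (assumed)
        + "}\n"
    )


DROP_MY_INDEX = drop_connector_update("my_index")
print(DROP_MY_INDEX)
```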
Listing connector instances returns all previously created instances. It is a SELECT query with the listConnectors predicate:
?cntUri will be bound to the prefixed URI of the connector instance that was used during creation, e.g., <http://www.ontotext.com/connectors/solr/instance#my_index>, while ?cntStr will be bound to a string, representing the part after the prefix, e.g., "my_index".
The internal state of each connector instance can be queried using a SELECT query and the connectorStatus predicate:
?cntUri will be bound to the prefixed URI of the connector instance, while ?cntStatus will be bound to a string representation of the status of the connector represented by this URI. The status is key-value based.
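The listing and status queries described above can be sketched as two small SELECT queries. The variable names follow the text; the Python constant names are only illustrative.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)

# Lists every connector instance: full URI plus the short name.
LIST_CONNECTORS = PREFIXES + (
    "SELECT ?cntUri ?cntStr {\n"
    "  ?cntUri :listConnectors ?cntStr .\n"
    "}\n"
)

# Returns the key-value status string of every connector instance.
CONNECTOR_STATUS = PREFIXES + (
    "SELECT ?cntUri ?cntStatus {\n"
    "  ?cntUri :connectorStatus ?cntStatus .\n"
    "}\n"
)

print(LIST_CONNECTORS)
print(CONNECTOR_STATUS)
```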
From the user's point of view, all synchronisation happens transparently without using any additional predicates or naming a specific store explicitly, i.e., the user should simply execute standard SPARQL INSERT/DELETE queries. This is achieved by intercepting all changes in the plugin and determining which abstract documents need to be updated.
Once a connector instance has been created, it will be possible to query data from it through SPARQL. For each matching abstract document, the connector instance returns the document subject. In its simplest form, querying is achieved by using a SELECT and providing the Solr query as the object of the :query predicate:
The result will bind ?entity to the two wines made from grapes that have "cabernet" in their name, namely :Yoyowine and :Franvino.
Note that you must use the field names you chose when you created the connector instance. It is perfectly valid to have field names identical to the property URIs but then you will be responsible for escaping any special characters according to what Solr expects.
First, we get a query instance of the requested connector instance by using the RDF notation "X a Y" (= X rdf:type Y), where X is a variable and Y is a connector instance URI. X will be bound to a query instance of the connector instance. Then we assign a query to the query instance by using the system predicate :query. Finally, we request the matching entities through the :entities predicate.
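The three steps above (query instance, `:query`, `:entities`) can be sketched as a small query builder. The Solr query `grape:cabernet` is a plausible example for the result described earlier; the helper names are hypothetical.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL "..." literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def entity_query(instance: str, solr_query: str) -> str:
    """SELECT the entities matching a Solr query on a connector instance."""
    return (
        PREFIXES
        + "SELECT ?entity {\n"
        + f"  ?search a inst:{instance} ;\n"                  # 1. query instance
        + f'      :query "{escape_literal(solr_query)}" ;\n'  # 2. Solr query
        + "      :entities ?entity .\n"                       # 3. matching entities
        + "}\n"
    )


CABERNET_QUERY = entity_query("my_index", "grape:cabernet")
print(CABERNET_QUERY)
```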
It is also possible to provide per-query search options by using one or more option predicates. The option predicates are described in detail below.
If you want to access a Solr query parameter that is not exposed through a special predicate, you can do it with a raw query. Instead of providing a full text query in the :query part, you specify raw Solr parameters. For example, if you want to sort the facets in a different order than described in facet.sort, you can execute the following query:
You can get these parameters when you do your query from the admin interface in Solr, or from the response payload (where they are included). We also support the query parameters from the select endpoint in Solr, if you prefer that. Here is an example:
Note that you have to specify q= as the first parameter because we use it for detecting the raw query.
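The rule that `q=` must come first can be enforced with a small helper. This sketch only assembles the parameter string in the select-endpoint style. The parameter values are illustrative Solr parameters, no URL encoding is applied (whether the connector expects it is not stated here), and the way the raw string is combined with `:facetFields` and `:facets` is an assumption.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def raw_solr_query(q: str, params: dict) -> str:
    """Join Solr parameters with '&', always placing q= first."""
    parts = [f"q={q}"]
    parts += [f"{key}={value}" for key, value in params.items() if key != "q"]
    return "&".join(parts)


RAW = raw_solr_query("*:*", {"facet": "true", "facet.sort": "index"})

# Assumed usage: the raw string replaces the full-text query in :query.
RAW_FACET_QUERY = PREFIXES + (
    "SELECT ?facetName ?facetValue ?facetCount {\n"
    "  ?search a inst:my_index ;\n"
    f'      :query "{RAW}" ;\n'
    '      :facetFields "year,sugar" ;\n'
    "      :facets _:f .\n"
    "  _:f :facetName ?facetName ;\n"
    "      :facetValue ?facetValue ;\n"
    "      :facetCount ?facetCount .\n"
    "}\n"
)

print(RAW)
print(RAW_FACET_QUERY)
```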
The bound ?entity can be used in other SPARQL triples in order to build complex queries that fetch additional data from GraphDB. For example, to see the actual grapes in the matching wines as well as the year they were made:
The result will look like this:
Note that :Franvino is returned twice because it is made from two different grapes, both of which are returned.
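A combined query of this kind might look like the sketch below, where the connector part binds `?entity` and ordinary triple patterns fetch the grape and the year. The `wine:` prefix label and the property name `hasYear` are assumptions; `madeFromGrape` is taken from the text.

```python
COMBINED_QUERY = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
    "PREFIX wine: <http://www.ontotext.com/example/wine#>\n"
    "SELECT ?entity ?grape ?year {\n"
    # Connector part: full-text match in Solr.
    "  ?search a inst:my_index ;\n"
    '      :query "grape:cabernet" ;\n'
    "      :entities ?entity .\n"
    # Repository part: plain triple patterns evaluated by GraphDB.
    "  ?entity wine:madeFromGrape ?grape ;\n"
    "          wine:hasYear ?year .\n"  # assumed property name
    "}\n"
)

print(COMBINED_QUERY)
```

A wine with two matching grapes produces two rows, one per `?grape` binding, which is why an entity can appear more than once.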
It is possible to access the match score returned by Solr with the :score predicate. As each entity has its own score, the predicate must come at the entity level. For example:
The result will look like this but the actual score might be different as it depends on the specific Solr version:
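A score query along these lines can be sketched as follows; note that `:score` is attached to `?entity`, not to the query instance. The Solr query text is an illustrative choice.

```python
SCORE_QUERY = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
    "SELECT ?entity ?score {\n"
    "  ?search a inst:my_index ;\n"
    '      :query "grape:cabernet" ;\n'
    "      :entities ?entity .\n"
    # The score belongs to each entity, so it hangs off ?entity.
    "  ?entity :score ?score .\n"
    "}\n"
)

print(SCORE_QUERY)
```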
Consider the sample wine data and the my_index connector instance described previously. We can use the same instance to also query facets:
It is important to specify the fields we want to facet by using the facetFields predicate. Its value must be a simple comma-delimited list of field names. In order to get the faceted results, we have to use the facets predicate and, as each facet has three components (name, value and count), the facets predicate binds a blank node, which in turn can be used to access the individual values for each component through the predicates facetName, facetValue, and facetCount.
The resulting bindings will look like in the table below:
We can easily see that there are three wines produced in 2012 and two in 2013. We also see that three of the wines are dry, while two are medium. However, it is not necessarily true that the three wines produced in 2012 are the same as the three dry wines as each facet is computed independently.
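The facet predicates described above can be sketched as a query builder. The field list `year,sugar` matches the results discussed; leaving out `:query` on the assumption that an empty query matches all documents, and the helper name, are choices made for this example.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def facet_query(instance: str, facet_fields: list) -> str:
    """SELECT name, value and count for every facet bucket."""
    fields = ",".join(facet_fields)  # simple comma-delimited list
    return (
        PREFIXES
        + "SELECT ?facetName ?facetValue ?facetCount {\n"
        + f"  ?search a inst:{instance} ;\n"
        + f'      :facetFields "{fields}" ;\n'
        + "      :facets _:f .\n"  # one blank node per bucket
        + "  _:f :facetName ?facetName ;\n"
        + "      :facetValue ?facetValue ;\n"
        + "      :facetCount ?facetCount .\n"
        + "}\n"
    )


WINE_FACETS = facet_query("my_index", ["year", "sugar"])
print(WINE_FACETS)
```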
Faceting of textual fields
Faceting by an analysed textual field will work but might produce unexpected results. Analysed textual fields are composed of tokens and faceting will use each token to create a faceting bucket. For example, "North America" and "Europe" will produce three buckets: "north", "america" and "europe", corresponding to each token in the two values. If you need to facet by a textual field and still do full-text search on it, it is best to create a copy of the field with the setting "analyzed":false. For more information, see Copy fields.
While basic faceting allows for simple counting of documents based on the discrete values of a particular field, there are more complex faceted or aggregation searches in Solr. The Solr GraphDB Connector provides a mapping from Solr results to RDF results but no mechanism for specifying the queries other than executing a raw query.
The Solr GraphDB Connector supports mapping of range, interval and pivot facets. For more information, please refer to the documentation of Solr.
The results are accessed through the predicate :aggregations (much like the basic facets are accessed through :facets). The predicate will bind multiple blank nodes that each contain a single aggregation bucket. The individual bucket items can be accessed through these predicates:
|:key||Key or value associated with the bucket||getValue() or getKey()|
|:count||Count of documents in the bucket||getCount()|
|:from||Start of range (RangeFacet)||getStart()|
|:to||End of range (RangeFacet)||getEnd()|
|:rangeGap||Gap of range (RangeFacet)||getGap()|
|:beforeCount||Count of documents before the first range (RangeFacet)||getBefore()|
|:afterCount||Count of documents after the last range (RangeFacet)||getAfter()|
|:betweenCount||Count of documents within all ranges (RangeFacet)||getBetween()|
|:parent||Pivot facets: points to the parent (upper level) blank node|
|:level||Pivot facets: level number where 1 is the uppermost level and the following levels are 2, 3 and so on|
|:levelName||Pivot facets: level name||getField()|
It is possible to sort the entities returned by a connector query according to one or more fields. Sorting is achieved by the orderBy predicate the value of which must be a comma-delimited list of fields. Each field may be prefixed with a minus to indicate sorting in descending order. For example:
The result will contain wines produced in 2013 sorted according to their sugar content in descending order:
By default, entities are sorted according to their matching score in descending order.
Note that if you join the entity from the connector query to other triples stored in GraphDB, GraphDB might scramble the order. To remedy this, use ORDER BY from SPARQL.
Sorting by textual fields
Sorting by an analysed textual field will work but might produce unexpected results. Analysed textual fields are composed of tokens and sorting will use the least (in the lexicographical sense) token. For example, "North America" will be sorted before "Europe" because the token "america" is lexicographically smaller than the token "europe". If you need to sort by a textual field and still do full-text search on it, it is best to create a copy of the field with the setting "analyzed":false. For more information, see Copy fields.
Solr imposes an additional requirement on fields used for sorting. They must be defined with multivalued = false.
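The `orderBy` value format can be sketched with a small helper that builds the comma-delimited list and adds the minus prefix for descending fields. The query `year:2013` with descending `sugar` mirrors the example described above; the helper names are hypothetical.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def order_by_value(fields: list) -> str:
    """Turn (field_name, descending) pairs into an orderBy value."""
    return ",".join(("-" + name) if descending else name
                    for name, descending in fields)


def sorted_query(instance: str, solr_query: str, fields: list) -> str:
    """SELECT entities sorted on the Solr side."""
    return (
        PREFIXES
        + "SELECT ?entity {\n"
        + f"  ?search a inst:{instance} ;\n"
        + f'      :query "{solr_query}" ;\n'
        + f'      :orderBy "{order_by_value(fields)}" ;\n'
        + "      :entities ?entity .\n"
        + "}\n"
    )


WINES_2013_BY_SUGAR = sorted_query("my_index", "year:2013", [("sugar", True)])
print(WINES_2013_BY_SUGAR)
```

The `sugar` field works here because it was created with `"multivalued": false`.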
Limit and offset are supported on the Solr side of the query. This is achieved through the predicates limit and offset. Consider this example in which we specify an offset of 1 and a limit of 1:
The result will contain a single wine, Franvino, as it would be second in the list, if we execute the query without the limit and offset:
Note that the specific order in which GraphDB returns the results depends on how Solr returns the matches, unless you specified sorting.
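A paged query might be built as in the sketch below. The Solr query `sugar:dry` and the use of plain integer literals for `:offset` and `:limit` are assumptions for this illustration.

```python
PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def paged_query(instance: str, solr_query: str, offset: int, limit: int) -> str:
    """SELECT one page of entities; paging is applied on the Solr side."""
    return (
        PREFIXES
        + "SELECT ?entity {\n"
        + f"  ?search a inst:{instance} ;\n"
        + f'      :query "{solr_query}" ;\n'
        + f"      :offset {int(offset)} ;\n"  # integer literal (assumed form)
        + f"      :limit {int(limit)} ;\n"
        + "      :entities ?entity .\n"
        + "}\n"
    )


SECOND_MATCH = paged_query("my_index", "sugar:dry", offset=1, limit=1)
print(SECOND_MATCH)
```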
Snippet extraction is used to extract highlighted snippets of text that match the query. The snippets are accessed through the dedicated predicate :snippets, which binds a blank node that in turn provides the actual snippets via the predicates :snippetField and :snippetText. The predicate :snippets must be attached to the entity, as each entity has a different set of snippets. For example, in a search for Cabernet:
The query will return the two wines made from Cabernet Sauvignon or Cabernet Franc grapes as well as the respective matching fields and snippets:
Note that the actual snippets might be somewhat different as this depends on the specific Solr implementation.
It is possible to tweak how the snippets are collected/composed by using the following option predicates:
- :snippetSize sets the maximum size of the extracted text fragment, 250 by default;
- :snippetSpanOpen sets the text to insert before the highlighted text, <em> by default;
- :snippetSpanClose sets the text to insert after the highlighted text, </em> by default.
The option predicates are set on the query instance, much like the :query predicate.
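The snippet predicates and options above can be combined as in this sketch. The option is set on the query instance, while `:snippets` hangs off `?entity`. Passing `:snippetSize` as a plain integer literal and the helper name are assumptions.

```python
from typing import Optional

PREFIXES = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
)


def snippet_query(instance: str, solr_query: str,
                  snippet_size: Optional[int] = None) -> str:
    """SELECT matching entities with the field and text of each snippet."""
    # Optional per-query setting; omitted when the default should apply.
    option = f"      :snippetSize {int(snippet_size)} ;\n" if snippet_size else ""
    return (
        PREFIXES
        + "SELECT ?entity ?snippetField ?snippetText {\n"
        + f"  ?search a inst:{instance} ;\n"
        + f'      :query "{solr_query}" ;\n'
        + option
        + "      :entities ?entity .\n"
        + "  ?entity :snippets _:s .\n"  # snippets are per entity
        + "  _:s :snippetField ?snippetField ;\n"
        + "      :snippetText ?snippetText .\n"
        + "}\n"
    )


CABERNET_SNIPPETS = snippet_query("my_index", "grape:cabernet", snippet_size=100)
print(CABERNET_SNIPPETS)
```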
You can get the total number of hits by using the :totalHits predicate, e.g., for the connector instance :my_index and a query that would retrieve all wines made in 2012:
As there are three wines made in 2012, the value 3 (of type xsd:long) will be bound to ?totalHits.
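A total-hits query of this kind can be sketched as follows. The Solr query `year:2012` is a plausible way to select the wines made in 2012 with the `year` field defined earlier.

```python
TOTAL_HITS_QUERY = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
    "SELECT ?totalHits {\n"
    "  ?search a inst:my_index ;\n"
    '      :query "year:2012" ;\n'
    # The count is a property of the query instance, not of an entity.
    "      :totalHits ?totalHits .\n"
    "}\n"
)

print(TOTAL_HITS_QUERY)
```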
The creation parameters define how a connector instance is created by the :createConnector predicate. Some are required and some are optional. All parameters are provided together in a JSON object, where the parameter names are the object keys. Parameter values may be simple JSON values such as a string or a boolean, or they can be lists or objects.
All of the creation parameters can also be set conveniently from the Create Connector user interface in the GraphDB Workbench without any knowledge of JSON.
Since Solr is a third-party service, you have to specify the URL on which it is running. The URL has the form http://hostname.domain:port/. There is no default value.
The RDF types of entities to sync are specified as a list of URIs. At least one type URI must be provided.
RDF data is often multilingual but you may want to map only some of the languages represented in the literal values. This can be done by specifying a list of language ranges that will be matched to the language tags of literals according to RFC 4647, Section 3.3.1. Basic Filtering. In addition, an empty range can be used to include literals that have no language tag. The list of language ranges will map all existing literals that have matching language tags.
The fields define exactly what parts of each entity will be synchronised as well as the specific details on the connector side. The field is the smallest synchronisation unit and it maps a property chain from GraphDB to a field in Solr. The fields are specified as a list of field objects. At least one field object must be provided. Each field object has further keys that specify details.
The name of the field defines the mapping on the connector side. It is specified by the key fieldName with a string value. The field name is used at query time to refer to the field. There are few restrictions on the allowed characters in a field name but, to avoid unnecessary escaping (which depends on how Solr parses its queries), we recommend keeping the field names simple.
The property chain (propertyChain) defines the mapping on the GraphDB side. A property chain is defined as a sequence of triples where the entity URI is the subject of the first triple, its object is the subject of the next triple and so on. In this model, a property chain with a single element corresponds to a direct property defined by a single triple. Property chains are specified as a list of URIs and at least one URI must be provided.
The URI of the document will be synchronised to the special field "id" in Solr. You may use it to query Solr directly and retrieve the matching entity URI.
See Copy fields for defining multiple fields with the same property chain.
The default value (defaultValue) provides means for specifying a default value for the field when the property chain has no matching values in GraphDB. The default value can be a plain literal, a literal with a datatype (xsd: prefix supported), a literal with language, or a URI. It has no default value.
If a field is indexed, it will be available for Solr queries. True by default.
This option corresponds to the property "indexed" in the Solr schema.
Fields can be stored in Solr and this is controlled by the Boolean option "stored". Stored fields are required for retrieving snippets. True by default.
This option corresponds to the property "stored" in the Solr schema.
When literal fields are indexed in Solr, they will be analysed according to the analyser settings. Should you require that a given field is not analysed, you may use "analyzed". This option has no effect for URIs (they are never analysed). True by default.
This option affects the Solr type that will be used for the field. True will use a type suitable for the values (i.e., text or numeric), while false will use the type "string", which is never analysed by Solr.
RDF properties and synchronised fields may have more than one value. If "multivalued" is set to true, all values will be synchronised to Solr. If set to false, only a single value will be synchronised. True by default.
This option corresponds to the "multiValued" property in the Solr schema. Note that Solr cannot order results by multivalued fields so you need to adjust your options accordingly.
By default, the Solr GraphDB Connector will use the datatype of literal values to determine how they should be mapped to Solr types. For more information on the supported datatypes, see Datatype mapping.
The mapping can be overridden through the property "datatype", which can be specified per field. The value of "datatype" may be any of the xsd: types supported by the automatic mapping or a native Solr type prefixed by native:, e.g., both xsd:long and native:tlongs will map to the tlongs type in Solr.
Often, it is convenient to synchronise one and the same data multiple times with different settings to accommodate different use cases, e.g., faceting or sorting vs full-text search. The Solr GraphDB Connector has explicit support for fields that copy their value from another field. This is achieved by specifying a single element in the property chain of the form @otherFieldName, where otherFieldName is another non-copy field.
For example, when we create an analysed field called "grape" and a non-analysed field called "grapeFacet" whose property chain is the single element @grape, both fields will be populated with the same values and "grapeFacet" is defined as a copy field that refers to the field "grape".
Note that the connector handles copy fields in a more optimal way than specifying a field with exactly the same property chain as another field.
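A field list with a copy field might be written as in this sketch: an analysed `grape` field and a non-analysed `grapeFacet` field whose property chain is the single element `@grape`. Only the `fields` fragment of the configuration is shown; the helper `copy_source` is hypothetical.

```python
import json

WINE = "http://www.ontotext.com/example/wine#"

fields = [
    {
        "fieldName": "grape",  # analysed, for full-text search
        "propertyChain": [
            WINE + "madeFromGrape",
            "http://www.w3.org/2000/01/rdf-schema#label",
        ],
    },
    {
        "fieldName": "grapeFacet",     # non-analysed copy, for faceting or sorting
        "propertyChain": ["@grape"],   # copy field: refers to the field "grape"
        "analyzed": False,
    },
]


def copy_source(field: dict):
    """Return the name of the field a copy field refers to, or None."""
    chain = field["propertyChain"]
    if len(chain) == 1 and chain[0].startswith("@"):
        return chain[0][1:]
    return None


print(json.dumps({"fields": fields}, indent=2))
print([copy_source(f) for f in fields])
```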
The Solr GraphDB Connector will map different types of RDF values to different types of Solr values according to the basic type of the RDF value (URI or literal) and the datatype of literals. The autodetection will use the following mapping:
|RDF value||RDF datatype||Solr type|
|literal||none||text_general or text_xx where xx is language dependent|
The datatype mapping can be affected by the synchronisation options, too. For example, a non-analysed field that has xsd:long values will not use "tlongs" but "string" instead.
Note that for any given field the automatic mapping will use the first value it sees. This will work fine for clean datasets but might lead to problems if your dataset has non-normalised data, e.g., the first value has no datatype but other values have.
The entityFilter parameter is used to fine-tune the set of entities and/or individual values for the configured fields, based on the field value. Entities and field values will be synchronised to Solr if, and only if, they pass the filter. The entity filter is similar to a FILTER() inside a SPARQL query but not exactly the same. Each configured field can be referred to in the entity filter by prefixing it with a "?", much like referring to a variable in SPARQL. Several operators are supported:
|?var in (value1, value2, ...)||Tests if the field var's value is one of the specified values. Values that do not match will be treated as if they were not present in the repository.||?status in ("active", "new")|
|?var not in (value1, value2, ...)||The negated version of the in-operator.||?status not in ("archived")|
|bound(?var)||Tests if the field var has a valid value. This can be used to make the field compulsory.||bound(?name)|
|expr1 || expr2||Logical disjunction of expressions expr1 and expr2.||bound(?name) || bound(?company)|
|expr1 && expr2||Logical conjunction of expressions expr1 and expr2.||bound(?status) && ?status in ("active", "new")|
|!expr||Logical negation of expression expr.||!bound(?company)|
|( expr )||Grouping of expressions||(bound(?name) || bound(?company)) && bound(?address)|
In addition to the operators, there are some constructions that can be used to write filters based not on the values but on values related to them:
The construction parent(?var) can be used to go to a previous level in a property chain. It can be applied recursively as many times as needed, e.g., parent(parent(parent(?var))) will go back in the chain three times. The effective value of parent(?var) can be used with the in or not in operator like this: parent(?company) in (<urn:a>, <urn:b>).
The construction ?var -> uri (alternatively ?var o uri or just ?var uri) can be used to access additional values that are accessible through the property uri. In essence, this construction corresponds to the triple pattern ?value uri ?effectiveValue, where ?value is a value bound by the field var. The effective value of ?var -> uri can be used with the in or not in operator like this: ?company -> rdf:type in (<urn:c>, <urn:d>). It can be combined with parent() like this: parent(?company) -> rdf:type in (<urn:c>, <urn:d>).
The URI parameter can be a full URI within < > or the special string rdf:type (alternatively just type), which will be expanded to http://www.w3.org/1999/02/22-rdf-syntax-ns#type.
The construction graph(?var) can be used to access the RDF graph of a field's value. The typical use case is to sync only explicit values: graph(?a) not in (<http://www.ontotext.com/implicit>). The construction can be combined with parent() like this: graph(parent(?a)) in (<urn:a>).
Entity filters can be combined with default values in order to get more flexible behaviour.
A typical use-case for an entity filter is having soft deletes, i.e., instead of deleting an entity, it is marked as deleted by the presence of a specific value for a given property.
For example, if we create a connector instance such as:
and then insert some entities:
We could create the following index to specify a default value for city:
The default value will be used for entity:b as it has no value for city in the repository. As the value is "London", the entity will be synchronised.
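The combination of an entity filter and a default value can be sketched as below. The namespace, the type and property URIs, the Solr URL and the exact filter expression are hypothetical. The filter contains double quotes, which JSON escapes with backslashes, and those backslashes must in turn be escaped inside a SPARQL `'''` literal, so the helper `sparql_long_literal` (also hypothetical) handles that step.

```python
import json

EX = "http://www.ontotext.com/example#"  # hypothetical namespace


def sparql_long_literal(text: str) -> str:
    """Wrap text in a SPARQL long literal, escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'''" + escaped + "'''"


config = {
    "solrUrl": "http://localhost:8983/solr",  # assumed URL
    "types": [EX + "gadget"],                 # hypothetical type
    "fields": [
        {
            "fieldName": "city",
            "propertyChain": [EX + "city"],   # hypothetical property
            "defaultValue": "London",         # used when an entity has no city
        }
    ],
    # Only entities whose (possibly defaulted) city is London pass the filter.
    "entityFilter": '?city in ("London")',
}

CREATE_FILTERED = (
    "PREFIX : <http://www.ontotext.com/connectors/solr#>\n"
    "PREFIX inst: <http://www.ontotext.com/connectors/solr/instance#>\n"
    "INSERT DATA {\n"
    "  inst:my_index :createConnector "
    + sparql_long_literal(json.dumps(config, indent=2))
    + " .\n}\n"
)

print(CREATE_FILTERED)
```

With this configuration, an entity without a city receives the default value "London" and therefore passes the filter.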
Sometimes data represented in RDF is not well suited to map directly to non-RDF. For example, if we have news articles and they can be tagged with different concepts (locations, persons, events, etc.), one possible way to model this is a single property :taggedWith. Consider the following RDF data:
Now, if we want to map this data to Solr so that the property :taggedWith x is mapped to separate fields taggedWithPerson and taggedWithLocation according to the type of x (we are not interested in events), we can map :taggedWith twice to different fields and then use an entity filter to get the desired values:
Note that type is the short way to write <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>.
The six articles in the RDF data above will be mapped as such:
|Article URI||Entity mapped?||Value in taggedWithPerson||Value in taggedWithLocation||Explanation|
|:Article1||yes||:Einstein||:Berlin||:taggedWith has the values :Einstein, :Berlin and :Cannes-FF. The filter leaves only the correct values in the respective fields. The value :Cannes-FF is ignored as it does not match the filter.|
|:Article2||yes||(none)||:Berlin||:taggedWith has the value :Berlin. After the filter is applied, only taggedWithLocation is populated.|
|:Article3||yes||:Mozart||(none)||:taggedWith has the value :Mozart. After the filter is applied, only taggedWithPerson is populated.|
|:Article4||yes||:Mozart||:Berlin||:taggedWith has the values :Berlin and :Mozart. The filter leaves only the correct values in the respective fields.|
|:Article5||yes||(none)||(none)||:taggedWith has no values. The filter is not relevant.|
|:Article6||yes||(none)||(none)||:taggedWith has the value :Cannes-FF. The filter removes it as it does not match.|
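The mapping in the table can be reproduced with a plain-Python model of the value-level filtering. This is not connector code: it only mimics the effect of a filter of the form shown in `ENTITY_FILTER`, whose type URIs are hypothetical. The tags and their types are read off the table and the surrounding description.

```python
# Hypothetical filter expression of the kind described above.
ENTITY_FILTER = (
    "?taggedWithPerson type in (<urn:Person>) "
    "&& ?taggedWithLocation type in (<urn:Location>)"
)

# rdf:type of each concept, as implied by the description.
TYPES = {
    "Einstein": "Person",
    "Mozart": "Person",
    "Berlin": "Location",
    "Cannes-FF": "Event",
}

# Values of :taggedWith per article, as listed in the table.
TAGGED_WITH = {
    "Article1": ["Einstein", "Berlin", "Cannes-FF"],
    "Article2": ["Berlin"],
    "Article3": ["Mozart"],
    "Article4": ["Berlin", "Mozart"],
    "Article5": [],
    "Article6": ["Cannes-FF"],
}


def map_article(tags: list) -> dict:
    """Keep in each field only the values whose type passes the filter."""
    return {
        "taggedWithPerson": [t for t in tags if TYPES[t] == "Person"],
        "taggedWithLocation": [t for t in tags if TYPES[t] == "Location"],
    }


MAPPED = {article: map_article(tags) for article, tags in TAGGED_WITH.items()}
for article, mapped_fields in MAPPED.items():
    print(article, mapped_fields)
```

Every article is still mapped as an entity; the filter only removes individual values, so :Article5 and :Article6 end up with both fields empty.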
This can be checked by issuing a faceted search for taggedWithLocation and taggedWithPerson:
If the filter was applied, you should get only :Berlin for taggedWithLocation and only :Einstein and :Mozart for taggedWithPerson:
The following diagram shows a summary of all predicates that can administer (create, drop, check status) connector instances or issue queries and retrieve results. It can be used as a quick reference of what a particular predicate needs to be attached to. For example, to retrieve entities, you need to use :entities on a search instance and to retrieve snippets, you need to use :snippets on an entity. Variables that are bound as a result of a query are shown in green, blank helper nodes are shown in blue, literals in red, and URIs in orange. The predicates are represented by labelled arrows.
Even though SPARQL per se is not sensitive to the order of triple patterns, the Solr GraphDB Connector expects to receive certain predicates before others so that queries can be executed properly. In particular, predicates that specify the query or query options need to come before any predicates that fetch results.
The diagram in Overview of connector predicates provides a quick overview of the predicates.
GraphDB prior to 6.2 shipped with a version of the Solr GraphDB Connector that had different options and slightly different behaviour and internals. Unfortunately, it is not possible to migrate existing connector instances automatically. To prevent any data loss, the Solr GraphDB Connector will not initialise if it detects an existing connector in the old format. The recommended way to migrate your existing instances is:
- back up the INSERT statement used to create the connector instance;
- drop the connector;
- deploy the new GraphDB version;
- modify the INSERT statement according to the changes described below;
- re-create the connector instance with the modified INSERT statement.
You might also need to change your queries to reflect any changes in field names or extra fields.
Prior to 6.2, a single field in the config could produce up to three individual fields on the Solr side, based on the field options. For example, for the field "firstName":
|firstName||produced, if the option "index" was true; used explicitly in queries|
|_facet_firstName||produced, if the option "facet" was true; used implicitly for facet search|
|_sort_firstName||produced, if the option "sort" was true; used implicitly for ordering connector results|
The current version always produces a single Solr field per field definition in the configuration. This means that you are responsible for creating all appropriate fields based on your needs. See more under Creation parameters.
To mimic the functionality of the old _facet_fieldName fields, you can either create a non-analysed copy field (for textual fields) or just use the normal field (for non-textual fields).
To mimic the functionality of the old _sort_fieldName fields, you can either create a non-analysed copy field (for textual fields) or just use the normal field (for non-textual fields).
Note that Solr imposes an additional requirement that sort fields have to be non-multivalued.
Prior to 6.2, the option manageExternalIndex could be used to control the management of both the schema and the core. In the current implementation, there are separate options, manageSchema and manageCore. For more information, see Schema and core management.