GraphDB-Enterprise Administration

h1. Manual Replication

_The master node automatically initiates replication of worker nodes as needed or appropriate. The manual replication option is obsolete and no longer available._


h1. Online backup and restore

The deep-replication mechanism is also present in the master node; it is the basis for replication to a remote cluster (see below), as well as for online backup and restore.
Cluster-wide backup and restore can be used to return the cluster to a previous operational state.

Backups are made by copying a worker node's image to a location on the master node's machine. A restore starts from a local image on the master node, which is replicated to all worker nodes in the cluster, propagating via peer master nodes as needed.

To start a backup, invoke the {{backup}} operation with a single parameter, the name used to identify the image. The backup image will go into a directory under the master node's repository data directory. At least two notification messages will be sent under normal operation, one to indicate that the backup operation has been started, and another to indicate that it has completed successfully.
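
Since {{backup}} is exposed as a JMX operation, it can be invoked from a JMX console or programmatically. The following is a minimal sketch only: the JMX service URL, port, and MBean object name shown are assumptions and should be replaced with the values configured for your master node.

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ClusterBackup {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX endpoint and MBean name -- replace with the actual
        // values configured for the master node.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://master-host:8089/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName master = new ObjectName("ReplicationCluster:name=ClusterInfo/master");

            // Start a backup identified by an image name of our choosing.
            // Note the usage rules below: "default" is reserved, and reusing
            // a name overwrites the previous image of that name.
            mbs.invoke(master,
                       "backup",
                       new Object[] { "nightly-2015-02-03" },
                       new String[] { String.class.getName() });
        } finally {
            connector.close();
        }
    }
}
{code}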

To restore the cluster state from a backup image, invoke the {{restoreFromImage}} operation with an existing backup's name as a parameter. The image created by the backup will be replicated to every worker throughout the cluster. At least two notification messages will be sent under normal operation, one to indicate that the restore operation has been started, and another to indicate that it has completed successfully.
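
{{restoreFromImage}} can be invoked in the same way. The sketch below makes the same assumptions as the backup example and, additionally, assumes that the start/completion messages are delivered as standard JMX notifications from the same MBean, which allows them to be observed with a notification listener.

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.ObjectName;

public class ClusterRestore {

    // `mbs` and `master` would be obtained exactly as in the backup sketch above.
    static void restore(MBeanServerConnection mbs, ObjectName master, String imageName)
            throws Exception {

        // Print the start and completion notifications emitted by the master node.
        NotificationListener listener = new NotificationListener() {
            @Override
            public void handleNotification(Notification n, Object handback) {
                System.out.println(n.getType() + ": " + n.getMessage());
            }
        };
        mbs.addNotificationListener(master, listener, null, null);

        // Replicate the named backup image to every worker throughout the cluster.
        // Any updates made since the image was created will be lost (see the notes below).
        mbs.invoke(master,
                   "restoreFromImage",
                   new Object[] { imageName },
                   new String[] { String.class.getName() });
    }
}
{code}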

Important usage notes:
* *The image name "default" is reserved for internal use.* Backing up to an image named "default" would interfere with the cluster's failure-recovery capabilities.
* Backing up a second time with the same image name overwrites the previous image of that name.
* Each master node maintains its own set of backup images. The set of backup images available from one master is completely unrelated to the sets of images available from other masters.
* After a successful restore operation, any updates executed between the time the backup image was created and the time the cluster was restored are irreversibly lost.
* If a restore operation fails, the cluster may be left in an inconsistent state.
* Initiating a backup or restore while another backup or restore operation is already in progress on the same master node is treated as an error and fails immediately. The failure of the offending operation does not interfere with the operation that was already in progress.

h1. Remote replication

It is sometimes desirable to maintain two clusters for the purpose of disaster recovery. When the network link between the two data-centres is fast and reliable, GraphDB instances in both data-centres can be connected into a single GraphDB cluster. This is the preferred approach.

However, in situations where the network link between the two data-centres is poor (slow, unreliable, highly transient, etc.), a better approach for keeping the two data-centres (i.e. two GraphDB clusters) in sync is to use 'remote replication'. Using this feature, a master node of one cluster can be added as a pseudo-worker node to the other cluster, making a hierarchy of clusters. Master nodes for remote clusters added in this way do not take part in query answering, but do receive all updates. In addition, many of the time-out and synchronisation parameters for the remote cluster are relaxed in order to cope with a more troublesome network layer.

_Remote replication is no longer available._