LVM-based Backup and Replication

Current by Reneta Popova, Sep 18, 2014 14:39.

h1. General

In essence, the Linux LVM-based Backup and Replication uses shell scripts to find the logical volume and volume group where the repository storage folder resides, and then creates a filesystem snapshot. Once the snapshot is created, the repository is available for reads and writes while the maintenance operation is still in progress. When it finishes, the snapshot is removed and the changes are merged back into the filesystem.

By default, the feature is disabled. To enable it, do the following:
* Get the scripts, which are located in the "lvmscripts" folder of the distribution.
* Place them on each of the workers in a chosen folder.
* Set the system property (JVM's \-D) named {{lvm-scripts}} to point to the folder with the scripts, e.g. {{\-Dlvm-scripts=<folder-with-the-scripts>}}.

{note}GraphDB checks if the folder contains scripts named create-snapshot.sh, release-snapshot.sh, and locatelvm.sh. This is done the first time you try to get the repository storage folder contents, for example, when you need to do a backup or perform a full replication.{note}
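That check can be sketched as a small shell function - a hypothetical helper (not part of the distribution), using the script names from the note above:

```shell
# Sketch: verify that the folder passed via -Dlvm-scripts contains the
# three scripts GraphDB looks for.
check_scripts_dir() {
    dir=$1
    for s in create-snapshot.sh release-snapshot.sh locatelvm.sh; do
        [ -f "$dir/$s" ] || { echo "missing: $s"; return 1; }
    done
    echo "ok"
}
```

For example, {{check_scripts_dir /opt/graphdb/lvmscripts}} (the path is only an example) prints "ok" when all three scripts are in place.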

The following prerequisites need to be met before the repository storage is considered "ready" for maintenance:
- Linux OS;
- The system property (JVM's \-D) named {{lvm-scripts}} points to the folder with the above scripts;
- The folder you are about to back up or use for replication contains a file named "owlim.properties";
- That folder DOES NOT contain a file named "lock".
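The storage-side conditions can be expressed as a short shell check (a sketch using the file names listed above; GraphDB performs the equivalent test internally):

```shell
# Storage is "ready" for maintenance when it holds owlim.properties
# and there is no "lock" file in it.
storage_ready() {
    dir=$1
    [ -f "$dir/owlim.properties" ] || return 1  # repository marker must exist
    [ ! -e "$dir/lock" ]                        # repository must not be locked
}
```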

GraphDB executes the script "locatelvm.sh" with a single parameter - the pathname of the storage folder from which you want to transfer the data (either to perform a backup or to replicate it to another node). While invoking it, GraphDB captures the script's standard and error output streams in order to get the logical volume, the volume group, and the storage location relative to the volume's mount point.

GraphDB also checks the exit code of the script (it MUST be 0) and fetches the locations by processing the script output: it must contain the logical volume (after 'lv='), the volume group ('vg='), and the relative path ('local=') from the mount point of the folder supplied as a script argument.
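Extracting the three values from that output could look like the following sketch (the actual processing happens inside GraphDB; this only illustrates the expected 'lv=', 'vg=', 'local=' format):

```shell
# Parse locatelvm.sh output of the form:
#   lv=<logical volume>
#   vg=<volume group>
#   local=<path relative to the mount point>
# Prints the three values, one per line, in that order.
parse_locate_output() {
    printf '%s\n' "$1" | sed -n 's/^lv=//p'
    printf '%s\n' "$1" | sed -n 's/^vg=//p'
    printf '%s\n' "$1" | sed -n 's/^local=//p'
}
```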

If the storage folder is not located on an LVM2-managed volume, the script fails with a different exit code (it relies on the exit code of the lvs command) and the whole operation reverts to the "classical" way of doing it - the same as in any of the previous versions.

If it succeeds in finding the volume group and the logical volume, the "create-snapshot.sh" script is executed, which creates a snapshot named after the value of the $BACKUP variable (see the config.sh script, which also defines where the snapshot will be mounted). The logical volume and volume group are passed as environment variables named LV and VG, preset by GraphDB when the script is executed.
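A minimal sketch of what create-snapshot.sh could look like. LV and VG are preset by GraphDB as described; $BACKUP and the mount point come from config.sh in the distribution - the defaults and the snapshot size below are assumptions, not the shipped values:

```shell
# Assumed defaults; the real values are defined in config.sh.
BACKUP=${BACKUP:-backup}
MOUNT_POINT=${MOUNT_POINT:-/mnt/$BACKUP}

create_snapshot() {
    # Create a copy-on-write snapshot of the origin volume, sized
    # relative to it, then mount it so the maintenance operation can
    # read a frozen view of the data.
    sudo -A lvcreate --snapshot --extents 10%ORIGIN \
        --name "$BACKUP" "/dev/$VG/$LV" &&
    sudo -A mkdir -p "$MOUNT_POINT" &&
    sudo -A mount "/dev/$VG/$BACKUP" "$MOUNT_POINT"
}
```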

If it also passes without any errors (script exit code = 0), the node is immediately initialised so that it is available for further operations (reads and writes).

The actual maintenance operation will now use the data from the 'backup' volume instead of the original mount point.

When the data transfer completes (with an error, cancelled, or normally), GraphDB invokes the 'release-snapshot.sh' script, which unmounts the backup volume and removes it. This way, the data changes are merged back into the original volume.
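The release step could be sketched in the same style (again with assumed config.sh defaults):

```shell
# Assumed defaults; the real values are defined in config.sh.
BACKUP=${BACKUP:-backup}
MOUNT_POINT=${MOUNT_POINT:-/mnt/$BACKUP}

release_snapshot() {
    # Unmount the snapshot and remove the snapshot volume; from this
    # point on, only the original volume - with all changes made during
    # the maintenance window - remains.
    sudo -A umount "$MOUNT_POINT" &&
    sudo -A lvremove --force "/dev/$VG/$BACKUP"
}
```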

h2. Some further notes

The scripts rely on root access to do "mount" and also to create and remove snapshot volumes. The SUDO_ASKPASS variable is set to point to the askpass.sh script in the same folder. All commands that need privileges are executed with {{sudo \-A}}, which invokes the command pointed to by the SUDO_ASKPASS variable; the latter simply prints the required password to its standard output. You should alter askpass.sh accordingly.
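A minimal askpass helper could be as simple as the following - anything that prints the password to standard output works. The password file path and variable name are assumptions for illustration:

```shell
# sudo -A executes the program in $SUDO_ASKPASS and reads the password
# from its standard output. Here it is read from a file that should be
# readable only by the GraphDB user (the path is an example).
askpass() {
    cat "${LVM_PASSWORD_FILE:-/etc/graphdb/lvm-password}"
}
```

Putting the body of this function into askpass.sh and pointing {{SUDO_ASKPASS}} at it avoids keeping the password inline in the script itself.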

We know that messing around with root privileges is a severe security risk, but it was the most straightforward way to achieve what was needed.

Administrators are free to achieve the same result by other, less exposed and dangerous means.

During the LVM-based maintenance session, GraphDB creates two additional zero-size files in the scripts folder: "snapshot-lock", indicating that a session has started, and "snapshot-created", indicating the successful completion of the "create-snapshot.sh" script. They are used to prevent other threads or processes from interfering with a maintenance operation that has been initiated and is still in progress.
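The guard can be sketched as follows (hypothetical helper names; GraphDB manages these marker files itself):

```shell
# Zero-size marker files in the scripts folder serialize maintenance
# sessions: snapshot-lock marks a session in progress, snapshot-created
# marks a successfully created snapshot.
begin_session() {
    [ ! -e "$1/snapshot-lock" ] || return 1  # another session is running
    : > "$1/snapshot-lock"
}
mark_snapshot_created() {
    : > "$1/snapshot-created"
}
end_session() {
    rm -f "$1/snapshot-lock" "$1/snapshot-created"
}
```

Note that the check-then-create above is only illustrative and not atomic; a real implementation would take the lock atomically (e.g. with {{mkdir}} or an exclusive-create open).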