Backing Up
Riak KV is a clustered system built to survive a wide range of failure scenarios, including the loss of nodes due to network or hardware failure. Although this is one of Riak KV’s core strengths, it cannot withstand all failure scenarios.
Backing up data (duplicating the database on a different long-term storage system) is a common approach to mitigating potential failure scenarios.
This page covers how to perform backups of Riak KV data.
Overview
Riak KV backups can be performed using operating system features or filesystems that support snapshots, such as LVM or ZFS, or by using tools like rsync or tar.
Your choice of Riak KV backup strategy will depend on your established backup methodologies and the backend configuration of your nodes.
The basic process for getting a backup of Riak KV from a node is as follows:
- Stop Riak KV with riak stop.
- Back up the appropriate data, ring, and configuration directories.
- Start Riak KV.
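The three steps above can be wrapped in a small shell function. The following is a minimal sketch, not an official tool: the function name backup_node is hypothetical, and it assumes the riak command is on the PATH and that you pass in the directories that apply to your backend.

```shell
#!/bin/sh
# Sketch: stop Riak KV, archive the given directories, then start it again.
# backup_node is a hypothetical helper name, not part of Riak KV.

backup_node() (
    dest_dir=$1      # directory that receives the archive
    shift            # remaining arguments: directories to archive
    archive="$dest_dir/riak_data_$(date +%Y%m%d_%H%M).tar.gz"

    # Abort if the node cannot be stopped (for example, it is not running).
    riak stop || exit 1

    tar -czf "$archive" "$@"
    status=$?

    # Restart the node even if tar failed, so a bad backup does not extend downtime.
    riak start || exit 1

    [ "$status" -eq 0 ] || exit "$status"
    echo "$archive"
)
```

On a Linux package installation using Bitcask, a call might look like backup_node /mnt/riak_backups /var/lib/riak/bitcask /var/lib/riak/ring /etc/riak. The function prints the path of the archive it wrote.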
Downtime of a node can be significantly reduced by using an OS feature or filesystem that supports snapshotting.
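As an illustration of the snapshot approach, the sketch below stops the node only long enough to create an LVM snapshot, then archives from the snapshot while the node is running again. The volume path /dev/vg0/riak, the snapshot name riak_snap, the 5G snapshot size, and the mount point /mnt/riak_snap are all hypothetical; the run and snapshot_backup helpers are likewise illustrative. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: reduce node downtime by archiving from an LVM snapshot.
# Volume names, snapshot size, and mount point are hypothetical.

run() {
    # With DRY_RUN=1, print each command instead of executing it.
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

snapshot_backup() (
    lv=${1:-/dev/vg0/riak}             # logical volume holding the Riak KV data
    mnt=${2:-/mnt/riak_snap}           # where the snapshot is mounted
    snap_name=riak_snap
    snap_dev="$(dirname "$lv")/$snap_name"
    archive="/mnt/riak_backups/riak_data_$(date +%Y%m%d_%H%M).tar.gz"

    # The node is down only while the snapshot is being created.
    run riak stop || exit 1
    run lvcreate --snapshot --size 5G --name "$snap_name" "$lv"
    snap_status=$?
    run riak start || exit 1
    [ "$snap_status" -eq 0 ] || exit "$snap_status"

    # Archive from the read-only snapshot while the node serves requests.
    run mount -o ro "$snap_dev" "$mnt" &&
        run tar -czf "$archive" -C "$mnt" .
    status=$?

    # Always try to clean up the snapshot, even if the archive step failed.
    run umount "$mnt"
    run lvremove -f "$snap_dev"
    exit "$status"
)
```

A snapshot covers only the volume it was taken from, so directories that live on another filesystem (the configuration directory, for example) still need to be copied separately.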
Due to Riak KV’s eventually consistent nature, backups can become slightly inconsistent from node to node.
Data could exist on some nodes and not others at the exact time a backup is made. Any inconsistency will be corrected once a backup is restored, either by Riak’s active anti-entropy processes or when the object is read, via read repair.
OS-Specific Directory Locations
The default Riak KV data, ring, and configuration directories for each of the supported operating systems are as follows:
Debian and Ubuntu
Data | Directory |
---|---|
Bitcask | /var |
LevelDB | /var |
Ring | /var |
Configuration | /etc |
Cluster Metadata | /var |
Search | /var |
Strong consistency | /var |
Fedora and RHEL
Data | Directory |
---|---|
Bitcask | /var |
LevelDB | /var |
Ring | /var |
Configuration | /etc |
Cluster Metadata | /var |
Search | /var |
Strong consistency | /var |
FreeBSD
Data | Directory |
---|---|
Bitcask | /var |
LevelDB | /var |
Ring | /var |
Configuration | /usr |
Cluster Metadata | /var |
Search | /var |
Strong consistency | /var |
OS X
Data | Directory |
---|---|
Bitcask | . |
LevelDB | . |
Ring | . |
Configuration | . |
Cluster Metadata | . |
Search | . |
Strong consistency | . |
Note: OS X paths are relative to the directory in which the package was extracted.
SmartOS
Data | Directory |
---|---|
Bitcask | /var |
LevelDB | /var |
Ring | /var |
Configuration | /opt |
Cluster Metadata | /var |
Search | /var |
Strong consistency | /var |
Solaris
Data | Directory |
---|---|
Bitcask | /opt |
LevelDB | /opt |
Ring | /opt |
Configuration | /opt |
Cluster Metadata | /opt |
Search | /opt |
Strong consistency | /opt |
Performing Backups
In previous versions of Riak KV, there was a riak-admin backup command commonly used for backups. This functionality is now deprecated. We strongly recommend using the backup procedure documented below instead.
Backups can be accomplished through a variety of common methods. Standard utilities such as cp, rsync, and tar can be used, as well as any backup system already in place in your environment.
A simple shell command, like those in the following examples, is sufficient for creating a backup of your Bitcask or LevelDB data, ring, and Riak KV configuration directories for a binary package-based Riak KV Linux installation.
The following examples use tar:
Backups must be performed while Riak KV is stopped to prevent data loss.
Bitcask
tar -czf /mnt/riak_backups/riak_data_`date +%Y%m%d_%H%M`.tar.gz \
/var/lib/riak/bitcask /var/lib/riak/ring /etc/riak
LevelDB
tar -czf /mnt/riak_backups/riak_data_`date +%Y%m%d_%H%M`.tar.gz \
/var/lib/riak/leveldb /var/lib/riak/ring /etc/riak
Cluster Metadata
tar -czf /mnt/riak_backups/riak_data_`date +%Y%m%d_%H%M`.tar.gz \
/var/lib/riak/cluster_meta
Search / Solr Data
tar -czf /mnt/riak_backups/riak_data_`date +%Y%m%d_%H%M`.tar.gz \
/var/lib/riak/yz
Strong Consistency Data
Persistently stored data used by Riak’s strong consistency feature can be backed up in an analogous fashion:
tar -czf /mnt/riak_backups/riak_data_`date +%Y%m%d_%H%M`.tar.gz \
/var/lib/riak/ensembles
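Before relying on an archive, it is worth confirming that it can be read back. The sketch below is one way to do that with standard tools; the function name verify_archive is hypothetical.

```shell
#!/bin/sh
# Sketch: check that a backup archive is readable from start to end.
# verify_archive is a hypothetical helper name.

verify_archive() {
    # gzip -t tests the compressed stream; tar -tzf reads every member header.
    gzip -t "$1" && tar -tzf "$1" > /dev/null
}
```

The function returns a non-zero status if the archive is truncated or corrupt, so it can gate later steps, for example verify_archive "$archive" || echo "backup failed verification" >&2.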
Restoring a Node
The method you use to restore a node will differ depending on a combination of factors, including node name changes and your network environment.
If you are replacing a node with a new node that has the same node name (typically a fully qualified domain name or IP address), then restoring the node is a simple process:
- Install Riak on the new node.
- Restore your old node’s configuration files, data directory, and ring directory.
- Start the node and verify proper operation with riak ping, riak-admin status, and other methods you use to check node health.
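The restore step can be sketched as follows, assuming the archive was created with tar commands like those shown earlier on this page. tar stores absolute paths without the leading /, so extracting relative to / puts the files back in their original locations. The function name restore_node is hypothetical.

```shell
#!/bin/sh
# Sketch: unpack a backup archive onto a freshly installed, stopped node.
# restore_node is a hypothetical helper name.

restore_node() (
    archive=$1
    root=${2:-/}     # extraction root; member names in the archive are relative
    tar -xzf "$archive" -C "$root"
)
```

Run the function as root so that file ownership is restored along with the data, then start the node and check it with riak ping and riak-admin status. Passing a scratch directory as the second argument lets you inspect the contents of an archive without touching the live directories.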
If the node name of a restored node (the -name argument in vm.args or the nodename parameter in riak.conf) is different from the name of the node that the restored backup was taken from, you will additionally need to:
- Mark the original instance down in the cluster using riak-admin down <node>
- Join the restored node to the cluster using riak-admin cluster join <node>
- Replace the original instance with the renamed instance with riak-admin cluster force-replace <node1> <node2>
- Plan the changes to the cluster with riak-admin cluster plan
- Finally, commit the cluster changes with riak-admin cluster commit
For more information on the riak-admin cluster commands, refer to our documentation on cluster administration.
For example, if there are five nodes in the cluster with the original node names riak1.example.com through riak5.example.com and you wish to restore riak1.example.com as riak6.example.com, you would execute the following commands on riak6.example.com.
Join to any existing cluster node.
riak-admin cluster join riak@riak2.example.com
Mark the old instance down.
riak-admin down riak@riak1.example.com
Force-replace the original instance with the new one.
riak-admin cluster force-replace \
riak@riak1.example.com riak@riak6.example.com
Display and review the cluster change plan.
riak-admin cluster plan
Commit the changes to the cluster.
riak-admin cluster commit
In addition to running these commands, change your configuration files to match the new name (the -name setting in vm.args in the older configuration system, or the nodename setting in riak.conf in the newer system).
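For the newer configuration system, the name change can be scripted. The sketch below assumes riak.conf already contains an uncommented line of the form nodename = riak@host; the function name set_nodename is hypothetical, and the path to riak.conf depends on your operating system.

```shell
#!/bin/sh
# Sketch: rewrite the nodename line in riak.conf, keeping a backup copy.
# set_nodename is a hypothetical helper name.

set_nodename() (
    conf=$1
    new_name=$2
    tmp="$conf.new.$$"

    # Keep the previous file next to the original as riak.conf.bak.
    cp -p "$conf" "$conf.bak" || exit 1

    # Replace only the value of the nodename line; all other lines pass through.
    sed "s|^nodename[[:space:]]*=.*|nodename = $new_name|" "$conf" > "$tmp" || exit 1

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
)
```

For the example above, the call would be set_nodename /etc/riak/riak.conf riak@riak6.example.com on a Linux package installation. If the nodename line is missing or commented out, the file is left unchanged, so check the result before starting the node.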
If the IP address of any node has changed, verify that the changes are reflected in your configuration files to ensure that the HTTP and Protocol Buffers interfaces are binding to the correct addresses.
A robust DNS configuration can simplify the restore process if the IP addresses of the nodes change, but the hostnames are used for the node names and the hostnames stay the same. Additionally, if the HTTP and Protocol Buffers interface settings are configured to bind to all IP interfaces (0.0.0.0), then no changes will need to be made to your configuration files.
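A quick way to review the interface bindings is to print the listener settings from riak.conf. This sketch assumes the newer configuration system, where the HTTP and Protocol Buffers listeners appear as lines such as listener.http.internal = 0.0.0.0:8098 and listener.protobuf.internal = 0.0.0.0:8087. The function name show_listeners is hypothetical.

```shell
#!/bin/sh
# Sketch: list the HTTP and Protocol Buffers listener lines in riak.conf.
# show_listeners is a hypothetical helper name.

show_listeners() {
    # Matches uncommented lines such as "listener.http.internal = 0.0.0.0:8098".
    grep -E '^listener\.(http|protobuf)\.[^ =]+[[:space:]]*=' "$1"
}
```

Any listener that shows an old IP address needs to be updated before the node is started.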
When performing restore operations involving riak-admin cluster force-replace, we recommend that you start only one node at a time and verify that each node that is started has the correct name for itself and for any other nodes whose names have changed:
- Verify that the correct name is present in your configuration file.
- Once the node is started, run riak attach to connect to the node. The prompt obtained should contain the correct node name.
  - (It may be necessary to enter an Erlang atom by typing x. and pressing Enter)
- Disconnect from the attached session with Ctrl-G + q.
- Finally, run riak-admin member_status to list all of the nodes and verify that all nodes listed have the correct names.
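The final check in the list above can be partly automated. The sketch below reads the output of riak-admin member_status on standard input and confirms that each expected node name appears in it. It assumes only that the node names are printed verbatim in that output; the function name check_members is hypothetical.

```shell
#!/bin/sh
# Sketch: confirm that every expected node name appears in member_status output.
# check_members is a hypothetical helper name.

check_members() (
    # Read the whole member_status listing from standard input.
    status_output=$(cat)
    for node in "$@"; do
        printf '%s\n' "$status_output" | grep -qF -- "$node" || {
            echo "missing node: $node" >&2
            exit 1
        }
    done
)
```

For the example cluster, this could be run as riak-admin member_status | check_members riak@riak2.example.com riak@riak6.example.com. A name that is absent is reported on standard error and the function returns a non-zero status. The check does not detect stale names that are still listed, so review the full output as well.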
Restoring a Cluster
Restoring a cluster from backups is documented on its own page.