Changing Cluster Information
Change the Node Name
The node name is an important setting for the Erlang VM, especially when you want to build a cluster of nodes, as the node name identifies both the Erlang application and the host name on the network. All nodes in the Riak cluster need these node names to communicate and coordinate with each other.
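In your configuration files, the node name defaults to riak@127.0.0.1.
To change the node name, change the following line (nodename in riak.conf, or the equivalent -name line in vm.args):
nodename = riak@127.0.0.1
Change it to something that corresponds to either the IP address or a resolvable host name for this particular node, like so (using an address from the example scenario later on this page):
nodename = riak@192.168.17.11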
Change the HTTP and Protocol Buffers binding address
By default, Riak’s HTTP and Protocol Buffers services are bound to the local interface, i.e. 127.0.0.1, and are therefore unable to serve requests from the outside network. The relevant setting is in your configuration files:
In riak.conf:
# For HTTP
listener.http.internal = 127.0.0.1:8098
# For Protocol Buffers
listener.protobuf.internal = 127.0.0.1:8087
In app.config:
% For HTTP, in the riak_core section
{http, [ {"127.0.0.1", 8098 } ]},
% For Protocol Buffers, in the riak_api section
{pb, [ {"127.0.0.1", 8087} ] },
Either change it to use an IP address that corresponds to one of the server’s network interfaces, or 0.0.0.0 to allow access from all interfaces and networks, e.g.:
listener.http.internal = 0.0.0.0:8098
% In the riak_core section
{http, [ {"0.0.0.0", 8098 } ]},
Make the same change for the Protocol Buffers interface if you intend to use it (which we recommend), so that the relevant line reads:
listener.protobuf.internal = 0.0.0.0:8087
% In the riak_api section
{pb, [ {"0.0.0.0", 8087} ] },
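As a rough illustration of the riak.conf edit described above, the following shell sketch rewrites a single listener setting in place. The helper name set_listener and the .bak backup suffix are assumptions made for this example and are not part of Riak.

```shell
# Hypothetical helper (not part of Riak): point one riak.conf listener at a new address.
# Usage: set_listener <path to riak.conf> <listener key> <ip:port>
set_listener() {
    conf="$1"; key="$2"; addr="$3"
    # Keep a copy of the original so the change can be reverted
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the line that starts with the key; every other line passes through unchanged
    sed "s|^${key}[[:space:]]*=.*|${key} = ${addr}|" "$conf.bak" > "$conf"
}
```

For example, set_listener /etc/riak/riak.conf listener.http.internal 0.0.0.0:8098 would bind HTTP to all interfaces. The path shown is an assumption, since the location of riak.conf varies by platform.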
Rename Single Node Clusters
To rename a single-node development cluster:
1. Stop the node with riak stop.
2. Change the node’s nodename parameter in riak.conf, or -name parameter in vm.args, to the new name.
3. Change any IP addresses in riak.conf or app.config if necessary. Specifically: listener.protobuf.$name, listener.http.$name, and listener.https.$name in riak.conf, and pb_ip, http, https, and cluster_mgr in app.config.
4. Delete the contents of the node’s ring directory. The location of the ring directory is the value for ring.state_dir in riak.conf, or ring_state_dir in app.config.
5. Start Riak on the node with riak start.
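The steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name rename_single_node, the RIAK override variable, and the .bak backup suffix are invented for the example, and the sketch covers the riak.conf case only (not vm.args or app.config).

```shell
# Hypothetical wrapper (not part of Riak) around the single-node rename steps.
# Usage: rename_single_node <path to riak.conf> <ring directory> <new node name>
# RIAK may be set to override the riak binary; it defaults to "riak".
rename_single_node() {
    conf="$1"; ring_dir="$2"; new_name="$3"
    # Step 1: stop the node
    "${RIAK:-riak}" stop || return 1
    # Step 2: rewrite the nodename line, keeping a backup of the original file
    cp "$conf" "$conf.bak" || return 1
    sed "s|^nodename[[:space:]]*=.*|nodename = ${new_name}|" "$conf.bak" > "$conf" || return 1
    # Step 3 (listener addresses) is left to the operator, since it depends on the setup
    # Step 4: delete the contents of the ring directory; :? aborts if the path is empty
    rm -rf "${ring_dir:?}"/*
    # Step 5: start the node under its new name
    "${RIAK:-riak}" start
}
```

Pass the ring directory explicitly, using the value of ring.state_dir from your own riak.conf, because the function deletes everything inside it.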
Rename Multi-Node Clusters
For multi-node clusters, a rename is a slightly more complex procedure; however, it is very similar to the process for renaming a single node.
Prior to Riak version 1.2, a cluster node’s name could only be changed with the riak admin reip command, which involves downtime for the entire cluster. As of Riak version 1.2, that method has been superseded by riak admin cluster force-replace, which is safer and does not require cluster-wide downtime.
There still exist scenarios that require nodes to be renamed while stopped, such as seeding a cluster with backups from another cluster that does not share the same node names. Please see the Clusters from Backups section for more details on renaming in this scenario.
The following example describes reconfiguring node names with the new riak admin cluster force-replace method.
Example Scenario
For this example scenario, Riak is operating in a cluster of 5 nodes with the following network configuration:
- riak@10.1.42.11 on node1.localdomain → IP address changing to 192.168.17.11
- riak@10.1.42.12 on node2.localdomain → IP address changing to 192.168.17.12
- riak@10.1.42.13 on node3.localdomain → IP address changing to 192.168.17.13
- riak@10.1.42.14 on node4.localdomain → IP address changing to 192.168.17.14
- riak@10.1.42.15 on node5.localdomain → IP address changing to 192.168.17.15
The above list shows the network configuration details for our 5 nodes, including the Erlang node name value, the node’s fully qualified domain name, and the new IP address each node will be configured to use.
The nodes in our example cluster are currently configured to use the 10.1.42. private subnetwork range. Our goal for this example will be to configure the nodes to instead use the 192.168.17. private subnetwork range and do so in a rolling fashion without interrupting cluster operation.
Process
This process can be accomplished in three phases. The details and steps required of each phase are presented in the following section.
- Down the node to be reconfigured
- Reconfigure node to use new address
- Repeat previous steps on each node
Down the Node
1. Stop Riak on node1.localdomain:
riak stop
The output should look like this:
Attempting to restart script through sudo -H -u riak
ok
2. From the node2.localdomain node, mark riak@10.1.42.11 down:
riak admin down riak@10.1.42.11
Successfully marking the node down should produce output like this:
Attempting to restart script through sudo -H -u riak
Success: "riak@10.1.42.11" marked as down
This step informs the cluster that riak@10.1.42.11 is offline and ring-state transitions should be allowed. While we’re executing the riak admin down command from node2.localdomain in this example, the command can be executed from any currently running node.
Reconfigure Node to Use New Address
Reconfigure node1.localdomain to listen on the new private IP address 192.168.17.11 by following these steps:
1. Change the node’s nodename parameter in riak.conf, or -name parameter in vm.args, to reflect the new node name. For example:
riak.conf: nodename = riak@192.168.17.11
vm.args: -name riak@192.168.17.11
2. Change any IP addresses to 192.168.17.11 in riak.conf or app.config as previously described in step 3 of Single Node Clusters.
3. Rename the node’s ring directory, the location of which is described in step 4 of Single Node Clusters. You may rename it to whatever you like, as it will only be used as a backup during the node renaming process.
4. Start Riak on node1.localdomain:
riak start
5. Join the node back into the cluster:
riak admin cluster join riak@10.1.42.12
Successful staging of the join request should have output like this:
Attempting to restart script through sudo -H -u riak
Success: staged join request for 'riak@192.168.17.11' to 'riak@10.1.42.12'
6. Use riak admin cluster force-replace to change all ownership references from riak@10.1.42.11 to riak@192.168.17.11:
riak admin cluster force-replace riak@10.1.42.11 riak@192.168.17.11
Successful force replacement staging output looks like this:
Attempting to restart script through sudo -H -u riak
Success: staged forced replacement of 'riak@10.1.42.11' with 'riak@192.168.17.11'
7. Review the new changes with riak admin cluster plan:
riak admin cluster plan
Example output:
Attempting to restart script through sudo -H -u riak
=========================== Staged Changes ============================
Action         Nodes(s)
-----------------------------------------------------------------------
join           'riak@192.168.17.11'
force-replace  'riak@10.1.42.11' with 'riak@192.168.17.11'
-----------------------------------------------------------------------

WARNING: All of 'riak@10.1.42.11' replicas will be lost

NOTE: Applying these changes will result in 1 cluster transition

#######################################################################
                     After cluster transition 1/1
#######################################################################

============================= Membership ==============================
Status     Ring    Pending    Node
-----------------------------------------------------------------------
valid      20.3%      --      'riak@192.168.17.11'
valid      20.3%      --      'riak@10.1.42.12'
valid      20.3%      --      'riak@10.1.42.13'
valid      20.3%      --      'riak@10.1.42.14'
valid      18.8%      --      'riak@10.1.42.15'
-----------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Partitions reassigned from cluster changes: 13
  13 reassigned from 'riak@10.1.42.11' to 'riak@192.168.17.11'
8. Commit the new changes to the cluster with riak admin cluster commit:
riak admin cluster commit
Output from the command should resemble this example:
Attempting to restart script through sudo -H -u riak
Cluster changes committed
9. Check that the node is participating in the cluster and functioning as expected:
riak admin member-status
Output should resemble this example:
Attempting to restart script through sudo -H -u riak
============================= Membership ==============================
Status     Ring    Pending    Node
-----------------------------------------------------------------------
valid      20.3%      --      'riak@192.168.17.11'
valid      20.3%      --      'riak@10.1.42.12'
valid      20.3%      --      'riak@10.1.42.13'
valid      20.3%      --      'riak@10.1.42.14'
valid      18.8%      --      'riak@10.1.42.15'
-----------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
10. Monitor hinted handoff transfers with the riak admin transfers command to ensure they have finished.
11. Clean up by deleting the renamed ring directory once all previous steps have been successfully completed.
When using the riak admin cluster force-replace command, you will always get a warning message like: WARNING: All of 'riak@10.1.42.11' replicas will be lost. Since we didn’t delete any data files and we are replacing the node with itself under a new name, we will not lose any replicas.
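The staging part of this procedure (start, join, force-replace, plan) can be sketched as one shell function that runs on the node being renamed. The function name stage_rename and the RIAK override variable are assumptions made for this example. The sketch stops after printing the plan, so that the commit stays a deliberate manual step.

```shell
# Hypothetical wrapper (not part of Riak): run on the renamed node once its
# configuration has been updated and its ring directory moved aside.
# Usage: stage_rename <old node name> <new node name> <running cluster member>
stage_rename() {
    old="$1"; new="$2"; peer="$3"
    "${RIAK:-riak}" start || return 1
    # Join the cluster through any node that is currently running
    "${RIAK:-riak}" admin cluster join "$peer" || return 1
    # Hand the old name's partition ownership to the new name
    "${RIAK:-riak}" admin cluster force-replace "$old" "$new" || return 1
    # Print the staged plan for review; committing is left as a manual step
    "${RIAK:-riak}" admin cluster plan
}
```

For the first node in this example, the call would be stage_rename riak@10.1.42.11 riak@192.168.17.11 riak@10.1.42.12, followed by riak admin cluster commit once the plan has been reviewed.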
Repeat previous steps on each node
Repeat the steps above for each of the remaining nodes in the cluster.
Use riak@192.168.17.11 as the target node for further riak admin cluster join commands issued from subsequently reconfigured nodes to join those nodes to the cluster.
riak admin cluster join riak@192.168.17.11
A successful join request staging produces output similar to this example:
Attempting to restart script through sudo -H -u riak
Success: staged join request for 'riak@192.168.17.12' to 'riak@192.168.17.11'
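When repeating the procedure, it can help to write down the old-to-new name mapping before touching each node. The sketch below derives it for this example's addressing scheme, where only the subnet prefix changes and the last octet is kept. Both function names are hypothetical, and the node list is the one from the example scenario.

```shell
# Hypothetical helper (not part of Riak): derive a node's new name by keeping
# the last octet of its old address and swapping in the new subnet prefix.
# Usage: new_name_for <old node name> <new prefix, e.g. riak@192.168.17.>
new_name_for() {
    old="$1"; new_prefix="$2"
    # ${old##*.} strips everything up to and including the final dot
    printf '%s%s\n' "$new_prefix" "${old##*.}"
}

# Print the old -> new mapping for the nodes still to be renamed
print_remaining_renames() {
    for old in riak@10.1.42.12 riak@10.1.42.13 riak@10.1.42.14 riak@10.1.42.15; do
        printf '%s -> %s\n' "$old" "$(new_name_for "$old" riak@192.168.17.)"
    done
}
```

Each pair printed by print_remaining_renames supplies the two arguments for that node's riak admin cluster force-replace command.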
Clusters from Backups
The above steps describe a process for renaming nodes in a running cluster. When seeding a new cluster with backups where the nodes must have new names, typically done as a secondary cluster or in a disaster recovery scenario, a slightly different process must be used. This is because the node names must resolve to the new hosts in order for the nodes to start and communicate with each other.
Expanding on the Example Scenario above, the steps below rename nodes in a cluster that is being restored from backups. They assume every node is offline and indicate when to bring each node online.
Bringing Up the First Node
In order to bring our first node online, we’ll first need to use the riak admin reip command on a single node. In this example, we’ll use riak@10.1.42.11 as our first node.
1. In riak.conf, change nodename (or -name in vm.args) from riak@10.1.42.11 to your new node name, riak@192.168.17.11.
2. On node1.localdomain, run riak admin reip riak@10.1.42.11 riak@192.168.17.11. This will change the name of riak@10.1.42.11 to riak@192.168.17.11 in the Riak ring.
3. Start Riak on node1.localdomain.
4. Once Riak is started on node1.localdomain, mark the rest of the nodes in the cluster down, using riak admin down. For example, we would down riak@10.1.42.12 with riak admin down riak@10.1.42.12.
5. Confirm every other node in the cluster is marked down by running riak admin member-status on node1.localdomain:
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      20.3%      --      'riak@192.168.17.11'
down       20.3%      --      'riak@10.1.42.12'
down       20.3%      --      'riak@10.1.42.13'
down       20.3%      --      'riak@10.1.42.14'
down       18.8%      --      'riak@10.1.42.15'
-------------------------------------------------------------------------------
Valid:1 / Leaving:0 / Exiting:0 / Joining:0 / Down:4
6. Ensure riak@192.168.17.11 is listed as the claimant by running riak admin ring-status on node1.localdomain:
================================== Claimant ===================================
Claimant:  'riak@192.168.17.11'
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
No pending changes.

============================== Unreachable Nodes ==============================
All nodes are up and reachable
Once all nodes are marked as down and our first node is listed as the claimant, we can proceed with the rest of the nodes.
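The command sequence for the first node can be sketched as a shell function. The function name bring_up_first_node and the RIAK override variable are assumptions made for this example. The sketch also assumes the nodename (or -name) setting has already been edited, and that the node is ready to accept admin commands as soon as riak start returns; a real script may need to wait for the node to respond before continuing.

```shell
# Hypothetical wrapper (not part of Riak): run on the first node after its
# nodename (or -name) setting has been changed and while Riak is still stopped.
# Usage: bring_up_first_node <old name> <new name> <other old node names...>
bring_up_first_node() {
    old="$1"; new="$2"; shift 2
    # Rewrite this node's own name in the restored ring
    "${RIAK:-riak}" admin reip "$old" "$new" || return 1
    "${RIAK:-riak}" start || return 1
    # Mark every other restored member down so ring changes can proceed
    for other in "$@"; do
        "${RIAK:-riak}" admin down "$other" || return 1
    done
    # Show membership and claimant for manual confirmation
    "${RIAK:-riak}" admin member-status
    "${RIAK:-riak}" admin ring-status
}
```

In this example the call would list riak@10.1.42.11 and riak@192.168.17.11 first, followed by the four remaining old node names.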
Bringing Up the Remaining Nodes
1. On each of the remaining nodes, change nodename in riak.conf, or -name in vm.args, as described above.
2. Move aside the ring directory. As in Multi-Node Clusters, we will save this ring directory as a backup until we’re finished.
3. Start each node. They will start as if they are each a member of their own cluster, but will retain their restored data.
4. Join each node to our first node using riak admin cluster join riak@192.168.17.11.
5. Force replace each node with its old node name. For example, riak admin cluster force-replace riak@10.1.42.12 riak@192.168.17.12.
6. Once the above is complete for each node, run riak admin cluster plan on any node. The output should look similar to below:
=============================== Staged Changes ================================
Action         Details(s)
-------------------------------------------------------------------------------
force-replace  'riak@10.1.42.12' with 'riak@192.168.17.12'
force-replace  'riak@10.1.42.13' with 'riak@192.168.17.13'
force-replace  'riak@10.1.42.14' with 'riak@192.168.17.14'
force-replace  'riak@10.1.42.15' with 'riak@192.168.17.15'
join           'riak@192.168.17.12'
join           'riak@192.168.17.13'
join           'riak@192.168.17.14'
join           'riak@192.168.17.15'
-------------------------------------------------------------------------------

WARNING: All of 'riak@10.1.42.12' replicas will be lost
WARNING: All of 'riak@10.1.42.13' replicas will be lost
WARNING: All of 'riak@10.1.42.14' replicas will be lost
WARNING: All of 'riak@10.1.42.15' replicas will be lost

NOTE: Applying these changes will result in 1 cluster transition

###############################################################################
                         After cluster transition 1/1
###############################################################################

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      20.3%      --      'riak@192.168.17.11'
valid      20.3%      --      'riak@192.168.17.12'
valid      20.3%      --      'riak@192.168.17.13'
valid      20.3%      --      'riak@192.168.17.14'
valid      18.8%      --      'riak@192.168.17.15'
-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Partitions reassigned from cluster changes: 51
  13 reassigned from 'riak@10.1.42.12' to 'riak@192.168.17.12'
  13 reassigned from 'riak@10.1.42.13' to 'riak@192.168.17.13'
  13 reassigned from 'riak@10.1.42.14' to 'riak@192.168.17.14'
  12 reassigned from 'riak@10.1.42.15' to 'riak@192.168.17.15'
7. If the above plan looks correct, commit the cluster changes with riak admin cluster commit.
8. Once the cluster transition has completed, all node names should be changed and be marked as valid in riak admin member-status like below:
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid      20.3%      --      'riak@192.168.17.11'
valid      20.3%      --      'riak@192.168.17.12'
valid      20.3%      --      'riak@192.168.17.13'
valid      20.3%      --      'riak@192.168.17.14'
valid      18.8%      --      'riak@192.168.17.15'
-------------------------------------------------------------------------------
Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
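The final check can also be scripted. The sketch below reads member-status output on standard input and succeeds only if no node name still starts with the old prefix and the summary line reports Down:0. The function name rename_complete is hypothetical, and the parsing assumes the output format shown on this page, which may differ between Riak versions.

```shell
# Hypothetical check (not part of Riak): reads `riak admin member-status`
# output on stdin and succeeds only if the rename looks complete.
# Usage: riak admin member-status | rename_complete <old name prefix>
rename_complete() {
    old_prefix="$1"
    status="$(cat)"
    # Fail if any listed node still carries the old prefix (-F: match literally)
    if printf '%s\n' "$status" | grep -qF "'${old_prefix}"; then
        return 1
    fi
    # Succeed only if the summary line reports that no node is down
    printf '%s\n' "$status" | grep -qE 'Down:0([^0-9]|$)'
}
```

For this example the check would be run as riak admin member-status | rename_complete 'riak@10.1.42.' on any node after the commit.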