Changing the IP Address of a Cassandra Node with auto_bootstrap:false

Changing the IP address of a Cassandra node is a common maintenance operation. It comes up when replacing a failed node from offsite backups (made with, for example, tablesnap [1]) or when upgrading a Cassandra node’s hardware in place.

Retaining data while changing IP is especially common in hosted virtualized environments like EC2, where operators do not control the IP addresses assigned to their instances. The hidden [2] auto_bootstrap option in cassandra.yaml allows an operator to change the IP address of a node easily.

If your node is in the following state:

  1. Has all the non-system-keyspace user data it had when it stopped, either because you didn’t lose it or because you restored it from an offsite backup.
  2. Still has the same cassandra.yaml file it had before the IP change, apart from the changed IP, and including its token information. If using a single token per node, this cassandra.yaml file must contain an explicit initial_token. If using vnodes, “initial_token” must contain a comma-delimited list of all the node’s randomly assigned tokens, obtainable from “nodetool info --tokens”. num_tokens will be ignored if initial_token is set, unless initial_token contains one and only one token. For safety, I recommend commenting out num_tokens entirely when setting initial_token. [3]
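
For vnodes, assembling the comma-delimited initial_token value by hand is error-prone, so here is a minimal sketch of how it could be scripted. It assumes that “nodetool info --tokens” prints one line per token in the form “Token : <value>”; check this against the output of your own Cassandra version. The function name initial_token_line and the sample output are hypothetical, for illustration only.

```python
def initial_token_line(nodetool_output: str) -> str:
    """Build a cassandra.yaml initial_token line from `nodetool info --tokens` output."""
    tokens = []
    for line in nodetool_output.splitlines():
        key, sep, value = line.partition(":")
        # Assumed format: one "Token : <value>" line per token; other lines are skipped.
        if sep and key.strip() == "Token" and value.strip():
            tokens.append(value.strip())
    if not tokens:
        raise ValueError("no Token lines found in nodetool output")
    return "initial_token: " + ",".join(tokens)


# Made-up sample output, shortened to three tokens for readability.
sample = """\
ID                     : 00000000-0000-0000-0000-000000000000
Token                  : -9223372036854775808
Token                  : -3074457345618258603
Token                  : 3074457345618258602
"""

print(initial_token_line(sample))
```

Paste the resulting line into cassandra.yaml and comment out num_tokens, as recommended above.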

This is the process to change its IP address:

  1. Modify cassandra.yaml by adding a line that tells the node not to auto-bootstrap (note the space after the colon):
    auto_bootstrap: false
  2. Start the node.
  3. Modify cassandra.yaml to remove this line:
    auto_bootstrap: false
  4. If your node has been down for longer than max_hint_window_in_ms, you will need to repair it, without the “-pr” option so that it repairs its non-primary replicas. For this reason, I recommend increasing max_hint_window_in_ms from its low default of one hour.
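
The config edits in steps 1 and 3 and the repair decision in step 4 can be sketched as small helpers. This is an illustration under stated assumptions, not a supported tool: the function names are hypothetical, the helpers operate on the text of cassandra.yaml rather than on the file itself, and they only recognize an uncommented auto_bootstrap line.

```python
def set_auto_bootstrap_false(yaml_text: str) -> str:
    """Step 1: drop any existing auto_bootstrap setting, then append 'auto_bootstrap: false'."""
    kept = [l for l in yaml_text.splitlines()
            if not l.strip().startswith("auto_bootstrap:")]
    kept.append("auto_bootstrap: false")  # note the space after the colon
    return "\n".join(kept) + "\n"


def remove_auto_bootstrap(yaml_text: str) -> str:
    """Step 3: remove the auto_bootstrap line once the node has started."""
    kept = [l for l in yaml_text.splitlines()
            if not l.strip().startswith("auto_bootstrap:")]
    return "\n".join(kept) + "\n"


def needs_full_repair(downtime_ms: int, max_hint_window_in_ms: int) -> bool:
    """Step 4: a repair (without -pr) is needed if downtime exceeded the hint window."""
    return downtime_ms > max_hint_window_in_ms


# Hypothetical two-line config standing in for a real cassandra.yaml.
config = "cluster_name: 'Test Cluster'\ninitial_token: 0\n"
for_startup = set_auto_bootstrap_false(config)   # use this version for step 2
restored = remove_auto_bootstrap(for_startup)    # back to the original text

two_hours_ms = 2 * 60 * 60 * 1000
one_hour_ms = 60 * 60 * 1000
print(for_startup)
print(needs_full_repair(two_hours_ms, one_hour_ms))
```

Keeping the edit and its reversal as a pair makes it harder to forget step 3 and leave auto_bootstrap: false in place.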

This procedure works because starting a node with auto_bootstrap:false and defined tokens tells the cluster “I am taking over these token ranges, no questions asked.”

Here is the comment from StorageService.java explaining this behavior:

// We bootstrap if we haven't successfully bootstrapped before, as long as we are not a seed.
// If we are a seed, or if the user manually sets auto_bootstrap to false,
// we'll skip streaming data from other nodes and jump directly into the ring.

This non-bootstrap code path then calls replacedEndpoint on the old version of the node with the specified range(s) and old IP address, leaving the new node as the sole owner.
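
To make the takeover concrete, here is a toy model of the outcome described above. It is not Cassandra’s code: the ring is reduced to a plain token-to-IP dictionary, and the function name take_over_tokens and all addresses are invented for illustration.

```python
def take_over_tokens(ring: dict, tokens: list, new_ip: str) -> dict:
    """Return a new token -> endpoint map in which new_ip owns every token in `tokens`."""
    updated = dict(ring)  # copy so the caller's ring is left untouched
    for token in tokens:
        updated[token] = new_ip  # the previous owner of this token is replaced
    return updated


# Three single-token nodes; the node that owned token 100 comes back on a new IP.
ring = {0: "10.0.0.1", 100: "10.0.0.2", 200: "10.0.0.3"}
after = take_over_tokens(ring, [100], "10.0.0.9")
print(after)
```

After the call, the old address no longer appears in the map, which mirrors the new node ending up as the sole owner of its ranges.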

[1] http://github.com/synack/tablesnap

[2] http://issues.apache.org/jira/browse/CASSANDRA-2447?focusedCommentId=13083551&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13083551

[3] Alternatively, a vnode user can restore the node’s tokens by restoring the system keyspace alongside user data.