Prepare for the migration

To prepare for the migration, record the port, UUID, and the location of the write-ahead log on the existing master. Decide the number of masters that you want to use. Then select an unused machine from the cluster and configure it as the new master.

  1. Establish a maintenance window (one hour should be sufficient). During this time the Kudu cluster will be unavailable.
  2. Decide how many masters to use. The number of masters should be odd. Three-master and five-master configurations are recommended; they can tolerate one and two failures, respectively.
  3. Perform the following preparatory steps for the existing master:
    • Identify and record the directories where the master’s write-ahead log (WAL) and data live. If using Kudu system packages, the default location for both is /var/lib/kudu/master, but they may be customized using the fs_wal_dir and fs_data_dirs configuration parameters. The commands below assume that fs_wal_dir is /data/kudu/master/wal and fs_data_dirs is /data/kudu/master/data. Your configuration may differ. For more information on configuring these directories, see Apache Kudu configuration.

    • Identify and record the port the master is using for RPCs. The default port value is 7051, but it may have been customized using the rpc_bind_addresses configuration parameter.

    • Identify the master’s UUID. It can be fetched using the following command:

      $ sudo -u kudu kudu fs dump uuid --fs_wal_dir=<master_wal_dir> [--fs_data_dirs=<master_data_dir>] 2>/dev/null
      master_data_dir: The location of the existing master’s previously recorded data directory.

      For example:
      $ sudo -u kudu kudu fs dump uuid --fs_wal_dir=/var/lib/kudu/master 2>/dev/null
      4aab798a69e94fab8d77069edff28ce0
    • (Optional) Configure a DNS alias for the master. The alias could be a DNS CNAME record (if the machine already has an A record in DNS), an A record (if the machine is only known by its IP address), or an alias in /etc/hosts. The alias should be an abstract representation of the master (e.g. master-1).
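
      The values gathered in this step can be read off the existing master in one pass. The following is a minimal sketch, not part of the official procedure: the helper name rpc_port_from_flagfile and the flag file path /etc/kudu/conf/master.gflagfile are assumptions, so substitute the file your deployment actually uses. It only handles host:port or bare-host values, not IPv6 literals.

      ```shell
      #!/usr/bin/env bash
      # Hypothetical helper: print the RPC port set in a master flag file.
      # Prints 7051 (the default) when rpc_bind_addresses is absent or has no port.
      rpc_port_from_flagfile() {
        local flagfile="$1" addr port
        # Use the last occurrence if the flag is repeated (assumed precedence).
        addr=$(grep -E '^-{1,2}rpc_bind_addresses=' "$flagfile" 2>/dev/null | tail -n 1 | cut -d= -f2)
        # Only the first address is considered if several are listed.
        addr=${addr%%,*}
        case "$addr" in
          *:*) port=${addr##*:} ;;
          *)   port=7051 ;;
        esac
        echo "$port"
      }

      # Usage on the existing master (paths are assumptions):
      #   rpc_port_from_flagfile /etc/kudu/conf/master.gflagfile
      #   sudo -u kudu kudu fs dump uuid --fs_wal_dir=/data/kudu/master/wal \
      #     --fs_data_dirs=/data/kudu/master/data 2>/dev/null
      ```

      Record the printed port and UUID together with the WAL and data directories; they are needed again during the migration itself.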

  4. If you have Kudu tables that are accessed from Impala, you must update the master addresses in the Apache Hive Metastore (HMS) database.
    • If you set up the DNS aliases, run the following statement in impala-shell, replacing master-1, master-2, and master-3 with your actual aliases.
      ALTER TABLE table_name
      SET TBLPROPERTIES
      ('kudu.master_addresses' = 'master-1,master-2,master-3');
    • If you do not have DNS aliases set up, see Step #11 in the Performing the migration section for updating HMS.
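
      When many Impala tables are backed by Kudu, the statement above has to be repeated for each one. The following is a minimal sketch that prints one statement per table so the output can be reviewed and then run in impala-shell. The helper name print_alter_statements and the table names db1.events and db1.users are hypothetical.

      ```shell
      #!/usr/bin/env bash
      # Hypothetical helper: print one ALTER TABLE statement per table name.
      # $1 is the comma-separated master list; the remaining arguments are table names.
      print_alter_statements() {
        local masters="$1" table
        shift
        for table in "$@"; do
          printf "ALTER TABLE %s SET TBLPROPERTIES ('kudu.master_addresses' = '%s');\n" "$table" "$masters"
        done
      }

      # Example with hypothetical table names.
      print_alter_statements 'master-1,master-2,master-3' db1.events db1.users
      ```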
  5. Perform the following preparatory steps for each new master:
    • Choose an unused machine in the cluster. The master generates very little load so it can be collocated with other data services or load-generating processes, though not with another Kudu master from the same configuration.

    • Ensure Kudu is installed on the machine, either from system packages (in which case the kudu and kudu-master packages should be installed) or by some other means.

    • Choose and record the directory where the master’s data will live.

    • Choose and record the port the master should use for RPCs.

    • (Optional) Configure a DNS alias for the master (e.g. master-2, master-3).
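
The sizing guidance in step 2 follows from majority voting: a configuration of n masters stays available while more than half of them are up, so an odd n tolerates (n - 1) / 2 failures. The following is a small sketch of that arithmetic for checking a planned master count before provisioning machines. The helper name tolerated_failures is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many master failures a configuration of
# $1 masters can tolerate; fail if the count is not a positive odd integer.
tolerated_failures() {
  local n="$1"
  case "$n" in
    ''|*[!0-9]*) echo "master count must be a positive integer" >&2; return 1 ;;
  esac
  # 10# forces base 10 so a leading zero is not read as octal.
  if [ "$((10#$n))" -lt 1 ] || [ $((10#$n % 2)) -eq 0 ]; then
    echo "master count should be a positive odd number" >&2
    return 1
  fi
  echo $(( (10#$n - 1) / 2 ))
}

tolerated_failures 3   # prints 1
tolerated_failures 5   # prints 2
```

An even count is rejected because it adds a machine without raising the number of failures the configuration can survive.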