Migrating a Deployment to a New Set of Hosts
This section describes how to migrate a Cloudera Data Science Workbench deployment to a new set of gateway hosts.
Migrating a CSD Deployment
This section describes how to migrate a CSD-based Cloudera Data Science Workbench service to a new set of gateway hosts.
- Add and Set Up the New Hosts
- Copy the JDK to the new host
- Copy the DNS Nameserver to the new host
- Copy the Kerberos Configurations
- Stop the CDSW Service
- Backup Application Data
- Delete CDSW Roles from Existing Hosts
- Move Backup to the New Master
- Update DNS Records for the New Master
- Add Role Instances for the New Hosts
- Run the Prepare Node command on the New Hosts
- Start the CDSW Service
Add and Set Up the New Hosts
- Add new hosts to your cluster as needed. Make sure they are gateway hosts that have been assigned gateway roles for HDFS, YARN, and Spark 2. Do not run any other services on these hosts.
- Set up the new hosts as per the Cloudera Data Science Workbench hardware requirements.
- Disable Untrusted SSH Access on the new hosts.
- Configure Block Devices on the new hosts.
Copy the JDK to the new host
Copy the /usr/java directory to the new host.
Copy the DNS Nameserver to the new host
Copy the /etc/resolv.conf file to the new host.
Copy the Kerberos Configurations
Copy the /etc/krb5.conf file to the new host.
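The three copy steps above could be scripted roughly as follows. This is only a sketch: the use of rsync over SSH, the root login, and the example hostname are assumptions, not part of the documented procedure. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Sketch: copy the JDK, resolver, and Kerberos configuration to a new host.
# Assumptions: rsync is installed on both hosts and root SSH access is allowed.
copy_prereqs() (
  new_host="$1"
  for path in /usr/java /etc/resolv.conf /etc/krb5.conf; do
    # Copy into the same parent directory on the target host.
    dest="root@${new_host}:$(dirname "$path")/"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "rsync -a $path $dest"
    else
      rsync -a "$path" "$dest" || exit 1
    fi
  done
)

# Example (hypothetical hostname):
#   DRY_RUN=1 copy_prereqs cdsw-master-new.example.com
```

Review the dry-run output before running the copy for real, since it overwrites the resolver and Kerberos files on the target host.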
Stop the CDSW Service
- Log into the Cloudera Manager Admin Console.
- On the Home > Status tab, click the dropdown arrow to the right of the CDSW service and select Stop.
- Confirm your choice on the next screen. When you see a Finished status, the action is complete.
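If you prefer to script this step, the Cloudera Manager REST API exposes stop and start commands for services. The sketch below is illustrative only: the Cloudera Manager URL, API version, cluster name, service name, and user are all hypothetical defaults that you would replace with the values from your own deployment.

```shell
# Sketch: issue a stop or start command for a service through the
# Cloudera Manager REST API. Every default below is a placeholder assumption.
cm_service_command() (
  action="$1"   # "stop" or "start"
  base="${CM_URL:-http://cm.example.com:7180}"
  url="${base}/api/${CM_API_VERSION:-v19}/clusters/${CM_CLUSTER:-Cluster1}/services/${CM_SERVICE:-cdsw}/commands/${action}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Show the request without sending it.
    echo "POST $url"
  else
    curl -sS -X POST -u "${CM_USER:-admin}:${CM_PASSWORD:?set CM_PASSWORD}" "$url"
  fi
)

# Example: preview the request that would stop the service.
#   DRY_RUN=1 cm_service_command stop
```

The API call only submits the command and returns a command object, so the Finished status in the Admin Console is still the way to confirm that the service has stopped. Cluster names that contain spaces need to be URL-encoded.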
Backup Application Data
In Cloudera Data Science Workbench, all stateful data is stored on the master host at /var/lib/cdsw. Back up the contents of this directory before you begin the migration process.
- Stop Cloudera Data Science Workbench.
- After stopping CDSW, and before running the following tar command, wait 2-5 minutes (depending on your disk speed) to ensure that all data from CDSW is successfully written to the disks. Otherwise the tar command may not capture all recent changes.
- To create the backup, run the following command on the master host:
tar -cvzf cdsw.tar.gz -C /var/lib/cdsw/ .
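As a minimal sketch, the backup can be wrapped in a small function that also checks that the resulting archive is readable. The sync call and the archive listing are added precautions assumed here, not part of the documented steps, and sync does not replace the 2-5 minute wait described above.

```shell
# Sketch: create the backup and confirm that the archive can be read back.
# Arguments default to the paths used in this section.
backup_cdsw() (
  src="${1:-/var/lib/cdsw}"
  archive="${2:-cdsw.tar.gz}"
  sync                                   # ask the kernel to flush pending writes
  tar -czf "$archive" -C "$src" . || exit 1
  # Listing the archive fails if it is truncated or corrupt.
  tar -tzf "$archive" > /dev/null || exit 1
  echo "Backup written to $archive"
)

# Example:
#   backup_cdsw /var/lib/cdsw /root/cdsw.tar.gz
```

Write the archive to a location outside /var/lib/cdsw so that tar does not try to include the archive in itself.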
Delete CDSW Roles from Existing Hosts
- Log into the Cloudera Manager Admin Console.
- Go to the CDSW service.
- Click the Instances tab.
- Select all the role instances.
- Select Actions for Selected > Delete. Click Delete to confirm the deletion.
Move Backup to the New Master
Copy the backup archive to the new master host, then unpack it into /var/lib/cdsw:
tar xvzf cdsw.tar.gz -C /var/lib/cdsw
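The move can be sketched as a transfer followed by a guarded extraction. The scp transfer, the hostname, and the /root staging path in the comments are assumptions for illustration. The function itself only checks that the archive exists before unpacking it.

```shell
# Sketch: unpack the backup on the new master host.
# Transfer step, run on the old master first (hypothetical hostname and path):
#   scp cdsw.tar.gz root@cdsw-master-new.example.com:/root/
restore_cdsw() (
  archive="${1:-cdsw.tar.gz}"
  dest="${2:-/var/lib/cdsw}"
  [ -f "$archive" ] || { echo "missing archive: $archive" >&2; exit 1; }
  mkdir -p "$dest" || exit 1
  # -p keeps the file permissions recorded in the archive.
  tar -xzpf "$archive" -C "$dest" || exit 1
  echo "Restored $archive into $dest"
)

# Example, run on the new master:
#   restore_cdsw /root/cdsw.tar.gz /var/lib/cdsw
```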
Update DNS Records for the New Master
Update your DNS records with the IP address for the new master host.
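After updating the records, it can help to confirm that the name resolves to the new address from the hosts that will use it. The following is a sketch that assumes glibc's getent is available. The hostname and IP address in the example are hypothetical.

```shell
# Sketch: confirm that a DNS name resolves to the new master's IP address.
# RESOLVER can be overridden; by default glibc's getent is queried.
check_dns() (
  name="$1"
  expected_ip="$2"
  # The first column of the resolver output holds the address.
  resolved="$(${RESOLVER:-getent ahostsv4} "$name" | awk '{print $1}' | sort -u)"
  if printf '%s\n' "$resolved" | grep -qxF "$expected_ip"; then
    echo "OK: $name resolves to $expected_ip"
  else
    echo "MISMATCH: $name resolves to: ${resolved:-nothing}" >&2
    exit 1
  fi
)

# Example (hypothetical values):
#   check_dns cdsw.example.com 10.0.0.5
```

A mismatch immediately after the change may only mean that a cached record has not yet expired.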
Add Role Instances for the New Hosts
- Log into the Cloudera Manager Admin Console.
- Go to the CDSW service.
- Click the Instances tab.
- Click Add Role Instances. Assign the Cloudera Data Science Workbench Master, Application, and Docker Daemon roles to the new master host. If you want to configure worker hosts, assign the Cloudera Data Science Workbench Worker and Docker Daemon roles to the new workers.
- Click Continue. On the Review Changes page, review the configuration changes to be applied. The wizard finishes by performing any actions necessary to add the new role instances.
Do not start the new roles at this point. You must run the Prepare Node command as described in the next step before the roles are started.
Run the Prepare Node command on the New Hosts
The new hosts require the following packages:
nfs-utils libseccomp lvm2 bridge-utils libtool-ltdl iptables rsync policycoreutils-python selinux-policy-base selinux-policy-targeted ntp ebtables bind-utils nmap-ncat openssl e2fsprogs redhat-lsb-core conntrack-tools socat
You can either install these packages manually now, or allow Cloudera Manager to install them as part of the Prepare Node command later in this step.
If you choose the latter, make sure that Cloudera Manager has the permissions needed to install the required packages. To do so, go to the CDSW service and click Configuration. Search for the Install Required Packages property and make sure it is enabled.
- Go to the CDSW service.
- Click Instances.
- Select all the role instances.
- Select Actions for Selected > Prepare Node. This will install the required set of packages on all the new hosts.
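If you install the packages manually, a quick check like the one below can show which ones are still missing on a host. It is a sketch that assumes an RPM-based host where rpm -q is available. The PKG_QUERY override is a hypothetical hook added only so the check can be exercised without rpm.

```shell
# Sketch: list the required packages that are not yet installed on this host.
REQUIRED_PACKAGES="nfs-utils libseccomp lvm2 bridge-utils libtool-ltdl iptables rsync policycoreutils-python selinux-policy-base selinux-policy-targeted ntp ebtables bind-utils nmap-ncat openssl e2fsprogs redhat-lsb-core conntrack-tools socat"

missing_packages() (
  for pkg in $REQUIRED_PACKAGES; do
    # rpm -q exits non-zero when the package is not installed.
    ${PKG_QUERY:-rpm -q} "$pkg" > /dev/null 2>&1 || echo "$pkg"
  done
)

# Example:
#   missing_packages
```

An empty result means every listed package is already present.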
Start the CDSW Service
- Log into the Cloudera Manager Admin Console.
- On the Home > Status tab, click the dropdown arrow to the right of the CDSW service and select Start.
- Confirm your choice on the next screen. When you see a Finished status, the action is complete.
Migrating an RPM Deployment
This section describes how to migrate an RPM-based Cloudera Data Science Workbench service to a new set of gateway hosts.
- Add and Set Up the New Hosts
- Copy the JDK to the new host
- Copy the DNS Nameserver to the new host
- Copy the Kerberos Configurations
- Stop Cloudera Data Science Workbench
- Backup Application Data
- Remove Cloudera Data Science Workbench from Existing Hosts
- Move Backup to New Master
- Update DNS Records for the New Master
- Install Cloudera Data Science Workbench on New Master Host
Add and Set Up the New Hosts
- Add new hosts to your cluster as needed. Make sure they are gateway hosts that have been assigned gateway roles for HDFS, YARN, and Spark 2. Do not run any other services on these hosts.
- Set up the new hosts as per the Cloudera Data Science Workbench hardware requirements.
- Disable Untrusted SSH Access on the new hosts.
- Configure Block Devices on the new hosts.
Copy the JDK to the new host
Copy the /usr/java directory to the new host.
Copy the DNS Nameserver to the new host
Copy the /etc/resolv.conf file to the new host.
Copy the Kerberos Configurations
Copy the /etc/krb5.conf file to the new host.
Stop Cloudera Data Science Workbench
cdsw stop
Backup Application Data
In Cloudera Data Science Workbench, all stateful data is stored on the master host at /var/lib/cdsw. Back up the contents of this directory before you begin the migration process.
- Stop Cloudera Data Science Workbench.
- After stopping CDSW, and before running the following tar command, wait 2-5 minutes (depending on your disk speed) to ensure that all data from CDSW is successfully written to the disks. Otherwise the tar command may not capture all recent changes.
- To create the backup, run the following command on the master host:
tar -cvzf cdsw.tar.gz -C /var/lib/cdsw/ .
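Because the archive has to travel to another host, recording a checksum next to it makes it possible to detect a damaged transfer. This is an optional sketch, not part of the documented procedure. It assumes the coreutils sha256sum tool is available on both hosts.

```shell
# Sketch: record a checksum beside the archive, then verify it after the move.
write_checksum() (
  cd "$(dirname "$1")" || exit 1
  sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
)

verify_checksum() (
  cd "$(dirname "$1")" || exit 1
  # sha256sum -c exits non-zero if the file no longer matches.
  sha256sum -c "$(basename "$1").sha256" > /dev/null
)

# Example:
#   write_checksum /root/cdsw.tar.gz     # on the old master, before the transfer
#   verify_checksum /root/cdsw.tar.gz    # on the new master, after the transfer
```

Copy the .sha256 file together with the archive so that the verification can run on the new master.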
Remove Cloudera Data Science Workbench from Existing Hosts
cdsw stop
yum remove cloudera-data-science-workbench
Move Backup to New Master
Copy the backup archive to the new master host, then unpack it into /var/lib/cdsw:
tar xvzf cdsw.tar.gz -C /var/lib/cdsw
Update DNS Records for the New Master
Update your DNS records with the IP address for the new master host.
Install Cloudera Data Science Workbench on New Master Host
For instructions, see Installing Cloudera Data Science Workbench 1.8.1 Using Packages.