Resizing the Data Lake through the CDP UI

Through the CDP UI, you can resize a light duty or medium duty Data Lake to medium duty or enterprise.

Required role: EnvironmentAdmin or Owner of the environment

Cloudera Manager configurations are not retained when a Data Lake is resized, because the resize creates a new Data Lake cluster as part of a backup and restore operation. Before you resize, note all custom Cloudera Manager configurations of your Data Lake, and reapply them once the resizing operation is complete.
  1. Stop all of the attached Data Hub clusters that can be stopped, to make sure that there are no changes to HMS metadata during the resizing operation. For any cluster that cannot be stopped, stop all of the services on the Data Hub through the Cloudera Manager UI.
  2. Verify that the DATALAKE_ADMIN_ROLE, RANGER_AUDIT_ROLE, and LOG_ROLE have read/write permissions to the backup location. See the Data Lake backup and restore documentation for more information on these permissions. LOG_ROLE is specific to Data Lake restore.
  3. In the CDP UI, click Data Lakes and select the Data Lake that you want to resize.
  4. Click Resize.

    You will be asked to confirm that you want to resize the Data Lake, after which the resizing process will begin. The resizing operation is finished when the Data Hub clusters have been automatically refreshed, which happens after the original Data Lake has been deleted. Check the Event History to verify that the Data Hubs have been refreshed.

  5. If the Data Lake is RAZ-enabled, check whether it was restored automatically. RAZ-enabled Data Lakes are currently eligible for automatic restore during a resizing operation only if you are resizing:
    • An AWS Data Lake on Cloudera Runtime version 7.2.15+
    • An Azure Data Lake on Cloudera Runtime version 7.2.16+
    For older Runtime versions, the Data Lake is backed up automatically, but you must manually restore it after the resizing is complete. If RAZ is in use on a Runtime version that is ineligible for automatic restore, make sure, before you start the Data Lake backup, that the restore_to_raz Ranger policy exists with access to the backup location in the cloud. See the instructions for manually restoring a RAZ-enabled Data Lake.
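
The eligibility rule above can be expressed as a small check, which may help when auditing several environments before a resize. This is only an illustrative sketch, not part of CDP: the function names and the MIN_AUTO_RESTORE table are hypothetical, the version thresholds are copied from the list above, and treating any cloud provider not listed there as ineligible is an assumption.

```python
import re
from typing import Tuple

# Minimum Cloudera Runtime versions for automatic restore of a
# RAZ-enabled Data Lake, copied from the list above.
MIN_AUTO_RESTORE = {
    "aws": (7, 2, 15),
    "azure": (7, 2, 16),
}


def parse_runtime(version: str) -> Tuple[int, ...]:
    """Convert a version such as '7.2.15' into (7, 2, 15).

    Only the leading dotted-number part is used, so a build suffix
    (for example '-123') is ignored. Comparing integer tuples avoids
    the string-comparison trap where '7.2.9' sorts after '7.2.15'.
    """
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Runtime version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def raz_auto_restore_eligible(cloud: str, runtime: str) -> bool:
    """Return True if a RAZ-enabled Data Lake qualifies for automatic restore."""
    minimum = MIN_AUTO_RESTORE.get(cloud.strip().lower())
    if minimum is None:
        # Assumption: providers not named in the list are treated as ineligible.
        return False
    return parse_runtime(runtime) >= minimum


if __name__ == "__main__":
    for cloud, runtime in [("AWS", "7.2.15"), ("Azure", "7.2.15"), ("AWS", "7.2.9")]:
        eligible = raz_auto_restore_eligible(cloud, runtime)
        action = "automatic restore" if eligible else "manual restore required"
        print(f"{cloud} {runtime}: {action}")
```

If the check returns False for a RAZ-enabled Data Lake, plan for the manual restore and confirm that the restore_to_raz Ranger policy is in place before the backup starts.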