Scaling the Data Lake through the CDP UI
You can scale a Data Lake from light duty to medium duty through the CDP UI.
Required role: EnvironmentAdmin or Owner of the environment
- Stop all of the attached Data Hub clusters that can be stopped, to make sure that there are no changes to HMS metadata during the scaling operation. For any cluster that cannot be stopped, stop all of the services on the Data Hub through the Cloudera Manager UI.
- Verify that the DATALAKE_ADMIN_ROLE, RANGER_AUDIT_ROLE, and LOG_ROLE have read/write permissions to the backup location. See the Data Lake backup and restore documentation for more information on these permissions. LOG_ROLE is specific to Data Lake restore.
- In the CDP UI, click Data Lakes and select the Data Lake that you want to scale.
- Click Resize.
You will be asked to confirm that you want to resize the Data Lake, after which the scaling process will begin. The scaling operation is finished when the Data Hub clusters have been automatically refreshed, which happens after the original light duty Data Lake has been deleted. Check the Event History to verify that the Data Hubs have been refreshed.
- When the scaling operation is finished, if you are upscaling a RAZ-enabled Data Lake, manually restore the Data Lake from the backup taken by the scaling process. To restore, you will need to add the restore_to_raz policy in Ranger and then run the cdp datalake restore-datalake command in the CDP CLI.
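
The restore step for a RAZ-enabled Data Lake could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the Data Lake name my-datalake, the DRY_RUN switch, and the run and restore_datalake helpers are hypothetical and exist only for illustration. It assumes that the CDP CLI is installed and configured, that the restore_to_raz policy has already been added in Ranger, and that --datalake-name is enough to identify the Data Lake. Confirm which backup is restored when none is specified, and which other options are available, in the CLI's built-in help before running it for real.

```shell
#!/bin/sh
# Sketch only: the values below are placeholders, not taken from this guide.
DATALAKE_NAME="${DATALAKE_NAME:-my-datalake}"   # hypothetical Data Lake name
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print the command, 0 = run it

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Restore the Data Lake from the backup taken by the scaling process.
# Prerequisite (done in Ranger, not shown here): add the restore_to_raz policy.
restore_datalake() {
  run cdp datalake restore-datalake --datalake-name "$1"
}

restore_datalake "$DATALAKE_NAME"
```

With DRY_RUN left at its default of 1, the script only prints the command it would run, which makes it safe to review first. Set DRY_RUN=0 once the Event History shows that the Data Hubs have been refreshed and the Ranger policy is in place.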