Upgrading a Data Lake manually via CLI

You can initiate a Data Lake upgrade with the CDP CLI. Using the same CLI command, you can also search for and validate available images to upgrade to, and generate JSON templates for specific upgrade scenarios.

Obtain image ID

If your Data Lake upgrade includes upgrading from CentOS to RHEL 8, prior to attempting an upgrade you need to obtain an ID of a target RHEL 8 image. You can obtain it from the image catalog by finding an image with your target Runtime version which has an OS Type of RHEL8.

Once you have identified the ID, you can provide it in the upgrade CLI command by using the --image-id flag.

Upgrade steps

  1. Run the cdp datalake upgrade-datalake command. In order to use this command for upgrading from CentOS to RHEL, ensure to provide an image ID of a RHEL 8 image.

    The command has the following options:

    cdp datalake upgrade-datalake
              --datalake-name <value>
              [--image-id <value>]
              [--runtime <value>]
              [--lock-components | --no-lock-components]
              [--dry-run | --no-dry-run]
              [--show-available-images | --no-show-available-images]
              [--show-available-image-per-runtime | --no-show-available-image-per-runtime]
              [--skip-backup | --no-skip-backup]
              [--skip-ranger-hms-metadata | --no-skip-ranger-hms-metadata]
              [--skip-atlas-metadata | --no-skip-atlas-metadata]
              [--skip-ranger-audits | --no-skip-ranger-audits]
              [--skip-backup-validation | --no-skip-backup-validation]
              [--cli-input-json <value>]
              [--generate-cli-skeleton]
    Option Description
    --datalake-name (string) Required. The name or CRN of the Data Lake to upgrade.
    --image-id (string) The ID of an image to upgrade to. If upgrading from CentOS to RHEL, make sure to provide an image ID of a target RHEL image.
    --runtime (string) The Runtime version to upgrade to. When you specify the Runtime version, the upgrade uses the latest image ID of the given Runtime version from the same image catalog used for Data Lake creation.
    --lock-components | --no-lock-components (boolean) Use --lock components to perform an OS upgrade only.
    --dry-run | --no-dry-run (boolean) Checks the eligibility of an image to upgrade. Can be used in conjunction with any other parameter, returning the available image (with respect to image Id, Runtime or lock-components set) without performing any actions.
    --show-available-images | --no-show-available-images (boolean) Returns the list of images that are eligible to upgrade to.
    --show-available-image-per-runtime | --no-show-available-image-per-runtime (boolean) Returns the latest image that is eligible to upgrade to, for each Runtime version with at least one available upgrade candidate.
    --skip-backup | --no-skip-backup If provided, will skip the backup flow for the upgrade process.
    --skip-ranger-hms-metadata | --no-skip-ranger-hms-metadata Skips the backup of the databases backing HMS/Ranger services. Redundant if –skip-backup is included. If this option is not provided, the HMS/Ranger services are backed up by default.
    --skip-atlas-metadata | --no-skip-atlas-metadata Skips the backup of the Atlas metadata. Redundant if –skip-backup is included. If this option is not provided, the Atlas metadata is backed up by default.
    --skip-ranger-audits | --no-skip-ranger-audits Skips the backup of the Ranger audits. Redundant if –skip-backup is included. If this option is not provided, Ranger audits are backed up by default.
    --skip-backup-validation | --no-skip-backup-validation Skips the validation steps that run prior to the backup. Redundant if –skip-backup is included. If this option is not provided, the validations are performed by default.
    --cli-input-json (string) Performs service operation based on the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, the CLI values will override the JSON-provided values.
    --generate-cli-skeleton (boolean) Prints a sample input JSON to standard output. Note the specified operation is not run if this argument is specified. The sample input can be used as an argument for --cli-input-json.

    When you run the cdp datalake upgrade-datalake command to initiate an upgrade, you have one of three options:

    1. Specify one of either --image-id, --runtime, or --lockComponents, which makes an explicit choice of the exact image, Runtime (latest OS), or latest OS (same Runtime) for upgrade.
    2. Specify both --image-id and --lockComponents, which specifies an image and ensures the image represents an OS only upgrade.
    3. Specify none of the --image-id, --runtime, or --lockComponents parameters, which initiates a Runtime/CM upgrade to the latest compatible version and OS image.
    Outside of upgrade, you can use the following options:
    • --show-available-images/--no-show-available-images
    • --show-available-images-per-runtime/--no-show-available-images-per-run time
    • --dry-run
    Examples of valid inputs:
    cdp datalake upgrade-datalake --datalake-name my-datalake --dry-run
    cdp datalake upgrade-datalake --datalake-name my-datalake --image-id d1c520b1-987d-461f-7860-918f43994c04
    cdp datalake upgrade-datalake --datalake-name my-datalake --image-id d1c520b1-987d-461f-7860-918f43994c04 --dry-run
    cdp datalake upgrade-datalake --datalake-name my-datalake --runtime 7.2.11
    cdp datalake upgrade-datalake --datalake-name my-datalake --runtime 7.2.11 --dry-run
    cdp datalake upgrade-datalake --datalake-name my-datalake--lock-components
    cdp datalake upgrade-datalake --datalake-name my-datalake --show-available-image-per-runtime
    cdp datalake upgrade-datalake --datalake-name my-datalake- --show-available-images
    
    Examples of incorrect inputs:
    cdp datalake upgrade-datalake --datalake-name my-datalake --image-id 7.2.11
    cdp datalake upgrade-datalake --datalake-name my-datalake--runtime d1c520b1-987d-461f-7860-918f43994c04
    cdp datalake upgrade-datalake --datalake-name my-datalake--lock-components --imageid imageid --runtime runtime
    cdp datalake upgrade-datalake --datalake-name my-datalake --show-available-image-per-runtime --show-available-images
    cdp datalake upgrade-datalake --datalake-name my-datalake --show-available-image-per-runtime --dry-run
    cdp datalake upgrade-datalake --datalake-name my-datalake --show-available-images --dry-run