Custom images and image catalogs

If necessary, you can use a custom Cloudera Runtime or FreeIPA image for compliance or security reasons. You can use the Cloudera CLI to register a custom image catalog and set the custom image within that catalog. Later, you can use this custom image to create a Data Lake or Cloudera Data Hub cluster, or an environment with a custom FreeIPA image.

A custom image should inherit most of its attributes from its source image, which is a default image that you select from the cdp-default image catalog.

The typical method of creating a Data Lake or Cloudera Data Hub picks up the latest pre-warmed image from the cdp-default image catalog for the specified version of Cloudera Runtime. These default images are pre-warmed VM images that contain a base URL to the default parcels in the Cloudera archive, amongst other configurations. If the default pre-warmed images do not suit your business needs, you can specify that the Data Lake, Cloudera Data Hub cluster or the environment (in the case of FreeIPA) uses a custom image instead.
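The default selection described above can be pictured as "filter by Runtime version, then take the newest image". The following is a minimal sketch of that idea only, not Cloudera's implementation; the `DefaultImage` type, the `latest_prewarmed` function, and all IDs, versions, and timestamps are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DefaultImage:
    image_id: str          # hypothetical identifier
    runtime_version: str   # Cloudera Runtime version the image was built for
    created: int           # creation timestamp (seconds since epoch)


def latest_prewarmed(images: Iterable[DefaultImage], runtime_version: str) -> Optional[DefaultImage]:
    """Return the newest image for the given Runtime version, or None if there is none."""
    matching = [img for img in images if img.runtime_version == runtime_version]
    return max(matching, key=lambda img: img.created, default=None)


# Example data only; these are not real catalog entries.
catalog = [
    DefaultImage("img-a", "7.2.17", 1_700_000_000),
    DefaultImage("img-b", "7.2.17", 1_710_000_000),
    DefaultImage("img-c", "7.2.18", 1_705_000_000),
]
chosen = latest_prewarmed(catalog, "7.2.17")
print(chosen.image_id if chosen else "no image")
```

A custom image replaces this automatic choice with an entry that you select explicitly.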

A custom image is an entry in a custom image catalog that inherits most of its attributes from a source (default) image.

Custom image entries have:

  • An image type: Cloudera Runtime (which includes Cloudera Data Hub and Data Lake images) or FreeIPA
  • A source image ID that points to an image in the cdp-default image catalog
  • A timestamp of creation
  • An option to specify a VM region and image reference (such as an AMI ID) if you are overriding the source image with a custom VM image
  • An option to override the parcel base URL
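To make the attributes listed above concrete, here is a minimal sketch of a custom image entry as a data structure. It is an illustrative model, not the catalog's actual schema; the class and field names, the placeholder source image ID, and the placeholder AMI ID are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class ImageType(Enum):
    RUNTIME = "runtime"   # covers Data Lake and Cloudera Data Hub images
    FREEIPA = "freeipa"


@dataclass
class CustomImageEntry:
    image_type: ImageType
    source_image_id: str                  # ID of an image in the cdp-default catalog
    created: int                          # creation timestamp (seconds since epoch)
    vm_images: Dict[str, str] = field(default_factory=dict)  # optional: region -> image reference
    base_parcel_url: Optional[str] = None                     # optional parcel base URL override

    def __post_init__(self) -> None:
        # Every custom image entry must point at a source image.
        if not self.source_image_id:
            raise ValueError("a custom image entry needs a source image ID")


entry = CustomImageEntry(
    image_type=ImageType.RUNTIME,
    source_image_id="example-source-image-id",          # placeholder
    created=1_700_000_000,
    vm_images={"us-west-2": "ami-0123456789abcdef0"},   # placeholder AMI ID
)
print(entry.image_type.value, entry.source_image_id)
```

The two optional fields are left empty when the entry simply reuses everything from its source image.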

You might require a custom image for compliance or security reasons (a “hardened” image), or to have your own packages pre-installed on the image, for example monitoring tools or software. You might also want to specify a custom image if you need to use a default image with a specific Runtime maintenance version applied, rather than simply specifying the latest major Runtime version.

In a custom image entry, you can override the VM images themselves with your own custom images that are sufficiently hardened. Importantly, you should only customize a default image from the cdp-default catalog as opposed to creating one from scratch. You can also override the default parcel base URL (at archive.cloudera.com) with your own host site.
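One way to think about these overrides is as a merge: the custom entry's values win where they are set, and everything else is inherited from the source image. The sketch below illustrates that reading. It assumes that regions without an override fall back to the source image's VM image, which this page does not confirm; the function name, the image references, and the example host `parcels.example.com` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Host named in the text; the scheme and the absence of a path are assumptions.
DEFAULT_PARCEL_BASE_URL = "https://archive.cloudera.com"


def effective_settings(
    source_vm_images: Dict[str, str],
    override_vm_images: Dict[str, str],
    override_parcel_url: Optional[str] = None,
) -> Tuple[Dict[str, str], str]:
    """Apply custom overrides on top of the attributes inherited from the source image."""
    # Later keys win, so an overridden region replaces the source image's value.
    vm_images = {**source_vm_images, **override_vm_images}
    parcel_url = override_parcel_url or DEFAULT_PARCEL_BASE_URL
    return vm_images, parcel_url


vm_images, parcel_url = effective_settings(
    source_vm_images={"us-west-2": "ami-default-west", "eu-west-1": "ami-default-eu"},
    override_vm_images={"us-west-2": "ami-hardened-west"},
    override_parcel_url="https://parcels.example.com",
)
print(vm_images)
print(parcel_url)
```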

A custom image catalog is a catalog that holds custom images. It can contain one or more custom image entries.

Custom image catalogs have:

  • A name. The name is a unique identifier that is used to refer to the catalog during environment, Data Lake, and Cloudera Data Hub cluster creation, as well as during catalog operations such as creating an image.
  • A description.
  • An owner. The owner is the user who runs the command to create the catalog.

The high-level steps for using a custom image are:

  • If you are replacing the VM images in a custom image entry with a customized version, first prepare the image by modifying an official Cloudera default image, which you can find under Shared Resources > Image Catalogs > cdp-default.
  • Select a source image from the cdp-default image catalog to be the source of customization. When you run the CLI command to find a default image, you specify the Runtime version, provider, image type, or a combination of the three.
  • Create a custom image catalog, or identify an existing catalog where you want to save the custom image entry.
  • Apply the necessary changes to the custom image entry, such as overriding the AMI IDs with the new, customized AMIs, or adding a new parcel base URL with the --base-parcel-url option when you set the custom image.
  • Create an environment, Data Lake, or Cloudera Data Hub cluster based on the custom catalog via the CDP CLI.
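The select, create, and apply steps can be modeled in a few lines of Python. This is a sketch of the workflow's data flow only and does not call the CDP CLI; `DefaultImage`, `find_default_images`, `CustomImageCatalog`, and every ID, version, and name in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class DefaultImage:
    image_id: str
    runtime_version: str
    provider: str
    image_type: str


def find_default_images(
    images: Iterable[DefaultImage],
    runtime_version: Optional[str] = None,
    provider: Optional[str] = None,
    image_type: Optional[str] = None,
) -> List[DefaultImage]:
    """Filter by Runtime version, provider, image type, or any combination of the three."""
    wanted = {"runtime_version": runtime_version, "provider": provider, "image_type": image_type}
    return [
        img for img in images
        if all(value is None or getattr(img, key) == value for key, value in wanted.items())
    ]


@dataclass
class CustomImageCatalog:
    name: str
    description: str
    owner: str
    entries: Dict[str, dict] = field(default_factory=dict)

    def set_image(self, image_id: str, source: DefaultImage,
                  vm_images: Optional[Dict[str, str]] = None,
                  base_parcel_url: Optional[str] = None) -> None:
        """Create or update a custom image entry that points at its source image."""
        self.entries[image_id] = {
            "source_image_id": source.image_id,
            "image_type": source.image_type,
            "vm_images": dict(vm_images or {}),
            "base_parcel_url": base_parcel_url,
        }


# Placeholder stand-in for the cdp-default catalog.
cdp_default = [
    DefaultImage("src-1", "7.2.17", "AWS", "runtime"),
    DefaultImage("src-2", "7.2.17", "AZURE", "runtime"),
    DefaultImage("src-3", "", "AWS", "freeipa"),
]

# Select a source image, create a catalog, then apply the overrides.
source = find_default_images(cdp_default, runtime_version="7.2.17", provider="AWS")[0]
catalog = CustomImageCatalog("hardened-images", "Hardened AWS images", owner="alice")
catalog.set_image("custom-1", source, vm_images={"us-west-2": "ami-0123456789abcdef0"})
print(catalog.entries["custom-1"]["source_image_id"])
```

Keying entries by image ID means that setting the same ID again updates the existing entry rather than adding a duplicate, which is a design choice of this sketch and not a documented behavior.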
