Configuring AWS on-demand capacity reservations and capacity blocks

Cloudera AI supports AWS on-demand capacity reservations and capacity blocks to ensure instance availability for critical workloads.

On-demand capacity reservations: These allow you to reserve EC2 capacity for a specific instance type in a particular Availability Zone without a long-term commitment.

Capacity blocks: These are specialized reservations designed for short, fixed-duration, high-performance workloads, primarily for GPU-based instances.

  1. Create the reservation in AWS:
    1. In the AWS Console, create a Capacity Reservation or Capacity Block.
    2. Select the required instance type in the region and Availability Zone where your Cloudera AI application will be hosted.
    3. Note the Capacity Reservation ID.
    For detailed instructions on creating these in the AWS environment, see Create a Capacity Reservation.
  2. Add a GPU Node Group in Cloudera AI:
    When creating or editing a Cloudera AI Inference service instance, perform the following steps:
    1. Add a GPU node group.
    2. Select the Instance Type that matches your AWS reservation.
    3. Enter the Capacity Reservation ID.
    4. Select the Subnet associated with the cluster where the application is being created.
  3. Verify the association:
    After the application is created, verify the association on the Launch Template page of the Amazon EC2 console. Ensure that the Capacity Reservation Target matches your Capacity Reservation ID.

    For more information about checking the Capacity Reservation Target, see Amazon EC2 launch template.

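The verification step can also be scripted instead of checked in the console. The following is a minimal sketch, not an official Cloudera or AWS tool: it assumes that you already know the ID of the launch template backing the GPU node group, and that the template data has the shape returned by the EC2 `DescribeLaunchTemplateVersions` API. The helper names (`reservation_target`, `targets_reservation`, `fetch_latest_template_data`), the sample data, and the reservation ID `cr-0123456789abcdef0` are hypothetical and used only for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical sample shaped like the "LaunchTemplateData" section of an
# EC2 DescribeLaunchTemplateVersions response. All values are placeholders.
SAMPLE_TEMPLATE_DATA: Dict[str, Any] = {
    "InstanceType": "p4d.24xlarge",
    "CapacityReservationSpecification": {
        "CapacityReservationTarget": {
            "CapacityReservationId": "cr-0123456789abcdef0",
        },
    },
}


def reservation_target(launch_template_data: Dict[str, Any]) -> Optional[str]:
    """Return the Capacity Reservation ID the template targets, or None."""
    spec = launch_template_data.get("CapacityReservationSpecification") or {}
    target = spec.get("CapacityReservationTarget") or {}
    return target.get("CapacityReservationId")


def targets_reservation(launch_template_data: Dict[str, Any], expected_id: str) -> bool:
    """Check that the template targets the reservation ID entered in Cloudera AI."""
    return reservation_target(launch_template_data) == expected_id


def fetch_latest_template_data(launch_template_id: str, region: str) -> Dict[str, Any]:
    """Fetch the latest launch template version (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_launch_template_versions(
        LaunchTemplateId=launch_template_id,
        Versions=["$Latest"],
    )
    return response["LaunchTemplateVersions"][0]["LaunchTemplateData"]
```

To check a live environment, pass the result of `fetch_latest_template_data(...)` to `targets_reservation(...)` together with the Capacity Reservation ID you noted in step 1. A result of `False` suggests that the node group was created without the reservation or with a different ID.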