Prerequisites for setting up Cloudera AI Inference service
Several prerequisites must be considered before setting up Cloudera AI Inference service.
Cloudera Manager supported versions
JSON Web Token-based authentication from Cloudera Control Plane to Cloudera AI Inference service requires Cloudera Manager 7.13.1 CHF3 or higher.
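As a quick sanity check, the requirement can be expressed as an ordered comparison against the minimum version. The sketch below is illustrative only: the parsing convention for the CHF hotfix suffix and the helper names are assumptions, not part of any Cloudera tooling.

```python
# Illustrative sketch: compare a Cloudera Manager version string against the
# minimum required for JWT-based authentication (7.13.1 CHF3).
# The "x.y.z CHFn" parsing convention here is an assumption for illustration.

def parse_cm_version(version: str) -> tuple:
    """Parse a version such as '7.13.1 CHF3' into (7, 13, 1, 3)."""
    base, _, hotfix = version.partition(" CHF")
    parts = tuple(int(p) for p in base.split("."))
    chf = int(hotfix) if hotfix else 0
    return parts + (chf,)

MINIMUM = parse_cm_version("7.13.1 CHF3")

def supports_jwt_auth(version: str) -> bool:
    """True if the given Cloudera Manager version meets the minimum."""
    return parse_cm_version(version) >= MINIMUM

print(supports_jwt_auth("7.13.1 CHF3"))  # True
print(supports_jwt_auth("7.13.1 CHF2"))  # False
```

Tuple comparison handles the hotfix level as a fourth component, so a later base release (for example 7.14.0) also satisfies the check.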
LDAP authentication
User authentication is performed by the Knox service running on Cloudera AI Inference service, which relies on the LDAP configuration defined in the Cloudera Control Plane. Without this LDAP integration, access to APIs and model endpoints is denied.
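In practice, authenticated clients present their token as an HTTP bearer credential when calling the Knox-protected APIs and model endpoints. The sketch below shows only the header construction; the token value is a placeholder, and no actual service URL or API path is assumed.

```python
# Sketch of the bearer-token pattern used when calling Knox-protected
# endpoints. The token value below is a placeholder, not a real credential.
import json

def build_auth_headers(jwt_token: str) -> dict:
    """Return HTTP headers carrying the JWT as a bearer token."""
    return {
        "Authorization": f"Bearer {jwt_token}",
        "Content-Type": "application/json",
    }

headers = build_auth_headers("<your-jwt-here>")
print(json.dumps(headers, indent=2))
```

A request sent without this header (or with a token for a user not present in the configured LDAP) is rejected by Knox.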
Storage credentials
Cloudera AI Inference service requires read-only storage credentials to download models. Alternatively, you can reuse the storage credentials already configured for Cloudera AI Registry.
The storage credentials certificate must be uploaded to the Cloudera Control Plane. For more details, see Updating TLS certificates.
In Cloudera AI on premises 1.5.5, Ozone is the storage provider. Both Ozone and Cloudera AI Inference service must reside within the same Cloudera Manager, as Ozone certificates are dynamically retrieved from the base cluster during Cloudera AI Inference service provisioning.
In Cloudera AI on premises 1.5.5 SP1, S3-compatible storage services are supported. A credential fallback mechanism, if enabled, automatically retrieves credentials for S3-compatible storage services from the Cloudera AI Registry when they are not explicitly provided during Cloudera AI Inference service provisioning.
To auto-populate credentials for S3-compatible storage services from the Cloudera AI Registry when provisioning Cloudera AI Inference service from the Command Line Interface, leave all S3-compatible storage credential fields empty. The system detects the empty fields and reuses the Cloudera AI Registry credentials for the S3-compatible storage services.
To auto-populate credentials for S3-compatible storage services from the Cloudera AI Registry in the UI, select the Use AI Registry Storage Credentials checkbox when creating the Cloudera AI Inference service.
When the Use AI Registry Storage Credentials checkbox is selected, the section for providing S3-compatible storage credentials is hidden and the credentials are populated from the Cloudera AI Registry.
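As an illustration of the Command Line Interface fallback, a provisioning payload that relies on the Cloudera AI Registry credentials would leave the credential fields empty. The field names below are hypothetical placeholders chosen for this sketch, not the exact API schema.

```json
{
  "s3CompatibleStorage": {
    "endpoint": "",
    "accessKeyId": "",
    "secretAccessKey": ""
  }
}
```

Because every field is empty, the fallback mechanism (when enabled) populates them from the Cloudera AI Registry configuration instead.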
Setting up GPU nodes for Cloudera AI Inference service
- Assign a taint to the node on Cloudera Embedded Container Service. For instructions, see Setting up the GPU node.
- Install the nvidia-container-toolkit on the worker node. For more details, see GPU nodes setup as worker nodes.
- Install the NVIDIA driver and container runtime. For instructions, see Installing NVIDIA GPU software in ECS.
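For reference, a GPU taint applied in the first step appears in the node's spec roughly as follows. The taint key, value, and effect shown here are illustrative assumptions; use the exact values given in Setting up the GPU node.

```yaml
# Illustrative node taint as rendered by `kubectl get node <node> -o yaml`.
# The key/value are an assumption for this sketch; follow Setting up the
# GPU node for the taint your ECS version expects.
spec:
  taints:
  - key: nvidia.com/gpu
    value: "true"
    effect: NoSchedule
```

A taint with effect NoSchedule keeps non-GPU workloads off the node; GPU workloads scheduled there must carry a matching toleration.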
Dedicated TLS certificate for Cloudera AI Inference service
Cloudera AI Inference service requires dedicated TLS certificates because it operates through a separate Istio gateway, which does not support shared certificates. During the installation of Cloudera on premises, the root Certificate Authority (CA) for these certificates must be set up. If this configuration step was missed, you can update the root CA by following the guidelines outlined in Updating TLS certificates.
Cloudera AI Registry
A Cloudera AI Registry must first be deployed in the same Cloudera environment in which you plan to deploy Cloudera AI Inference service. That is, Cloudera AI Inference service can only deploy models registered to a Cloudera AI Registry in the same Cloudera environment. Cloudera AI Registry 1.5.5 or a higher version must be created before provisioning Cloudera AI Inference service. If you have an existing Cloudera AI Registry in the environment, you must first upgrade it to version 1.5.5 or higher before provisioning Cloudera AI Inference service.
