Prerequisites for setting up Cloudera AI Inference service
Consider the following prerequisites before setting up Cloudera AI Inference service.
Cloudera Manager supported versions
JSON Web Token (JWT) based authentication from the Cloudera Control Plane to Cloudera AI Inference service requires Cloudera Manager version 7.13.1 CHF3 or higher.
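One way to confirm the Cloudera Manager version is through its REST API. This is a sketch only; the hostname, credentials, and API version below are placeholders for your deployment.

```shell
# Query Cloudera Manager for its version over the REST API.
# cm-host, the admin credentials, and the API version (v54) are placeholders;
# add -k if Cloudera Manager uses a self-signed certificate.
curl -s -u admin:admin "https://cm-host:7183/api/v54/cm/version"
# The JSON response contains a "version" field; confirm it reports 7.13.1 CHF3 or higher.
```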
LDAP Authentication
User authentication is performed by the Knox service running on Cloudera AI Inference service, which relies on the LDAP configuration defined in the Cloudera Control Plane. Without this LDAP integration, access to APIs and model endpoints is denied.
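You can sanity-check the LDAP settings that the control plane hands to Knox with a standalone bind and search. Every value below (host, bind DN, base DN, user) is a placeholder for your directory.

```shell
# Bind to the directory with the same bind DN configured in the control plane,
# then search for a known user. -W prompts for the bind password.
ldapsearch -x \
  -H ldaps://ldap.example.com:636 \
  -D "cn=bind-user,ou=users,dc=example,dc=com" \
  -W \
  -b "ou=users,dc=example,dc=com" \
  "(uid=jdoe)"
```

A successful bind and a returned entry indicate the LDAP configuration that Knox depends on is reachable and correct.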
Ozone Credentials
The Ozone service must be available on the Cloudera base cluster. Cloudera AI Inference service requires read-only Ozone S3 credentials to access Ozone for model downloads. Use the same Ozone credentials that you created for Cloudera AI Registry. Both Ozone and Cloudera AI Inference service must be managed by the same Cloudera Manager instance, because Ozone certificates are retrieved dynamically from the base cluster during Cloudera AI Inference service provisioning.
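On a Kerberized base cluster, the S3 credentials can be generated with the Ozone CLI. The principal name below is a placeholder for your environment.

```shell
# Authenticate as the user that will own the read-only S3 credentials
# (the principal name is a placeholder for your environment).
kinit modelreguser

# Generate or fetch the S3 secret for the authenticated user. The output
# contains awsAccessKey and awsSecret values, which you supply when
# provisioning Cloudera AI Inference service.
ozone s3 getsecret
```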
Setting up GPU nodes for Cloudera AI Inference service
- Assign a taint to the GPU node on Cloudera Embedded Container Service. For details, see Setting up the GPU node.
- Install the nvidia-container-toolkit on the worker node. For details, see GPU nodes setup as worker nodes.
- Install the NVIDIA driver and container runtime. For details, see Installing NVIDIA GPU software in ECS.
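The steps above can be sketched as follows, assuming a node named gpu-node-1 and the common nvidia.com/gpu taint key; confirm the exact taint and node name your ECS setup expects.

```shell
# Taint the GPU node so only pods that tolerate the taint are scheduled on it
# (taint key and value are assumptions; use the ones your ECS setup requires).
kubectl taint nodes gpu-node-1 nvidia.com/gpu=true:NoSchedule

# On the worker node, confirm the NVIDIA driver is installed and the GPU is visible.
nvidia-smi

# After the node joins the cluster, confirm Kubernetes advertises the GPU resource.
kubectl describe node gpu-node-1 | grep "nvidia.com/gpu"
```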
Dedicated TLS certificate for Cloudera AI Inference service
Cloudera AI Inference service requires dedicated TLS certificates because it operates through a separate Istio gateway, which does not support shared certificates. The root Certificate Authority (CA) for these certificates must be set up during the installation of Cloudera on premises. If this configuration step was skipped, you can update the root CA by following the guidelines in Updating TLS certificates.
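After provisioning, you can inspect the certificate presented by the service's Istio gateway with openssl to confirm the dedicated certificate and its issuing CA. The gateway hostname below is a placeholder.

```shell
# Connect to the Istio gateway endpoint and print the subject, issuer, and
# validity dates of the certificate it serves (hostname is a placeholder).
openssl s_client -connect ml-inference.example.com:443 \
    -servername ml-inference.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
```

The issuer should match the root CA configured during the Cloudera on premises installation.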
Cloudera AI Registry
A Cloudera AI Registry must first be deployed in the same Cloudera environment in which you plan to deploy Cloudera AI Inference service; the service can only deploy models registered to a Cloudera AI Registry in that environment. For this release, Cloudera AI Registry version 1.5.5 or higher must be created before provisioning Cloudera AI Inference service. If you have an existing Cloudera AI Registry in the environment, you must first upgrade it to version 1.5.5 or higher before provisioning Cloudera AI Inference service.