Creating a Cloudera AI Inference service instance on Cloudera Embedded Container Service

Cloudera AI Inference service is available only on the Cloudera Embedded Container Service platform with Cloudera AI on premises. You create a Cloudera AI Inference service instance by generating the CLI input skeleton, customizing the resulting JSON file, and then passing the file to the creation command.

The following example shows these steps in detail:

  1. Generate the JSON skeleton payload and save it to a file:

    $ cdp ml create-ml-serving-app --generate-cli-skeleton > /tmp/create-serving-app-input.json
  2. Customize the JSON file with values for your deployment, replacing the placeholder CRNs and instance settings as needed. One way to look up the environment CRN and to validate the edited file is sketched after these steps.
    {
        "appName": "my-aws-caii-cluster",
        "environmentCrn": "[***CDP-ENVIRONMENT-CRN***]",
        "clusterCrn": "[***COMPUTE-CLUSTER-CRN***]",
        "provisionK8sRequest": {
            "instanceGroups": [
                {
                    "instanceType": "m5.4xlarge",
                    "instanceTier": "ON-DEMAND",
                    "instanceCount": 1,
                    "name": "[***OPTIONAL-LEAVE BLANK***]",
                    "rootVolume": {
                        "size": 256
                    },
                    "autoscaling": {
                        "minInstances": 0,
                        "maxInstances": 5,
                        "enabled": true
                    }
                },
                {
                    "instanceType": "p4de.24xlarge",
                    "instanceCount": 1,
                    "rootVolume": {
                        "size": 1024
                    },
                    "autoscaling": {
                        "minInstances": 0,
                        "maxInstances": 5,
                        "enabled": true
                    }
                }
            ],
            "environmentCrn": "[***CDP-ENVIRONMENT-CRN***]",
            "tags": [
                {
                    "key": "experience",
                    "value": "cml-serving"
                }
            ]
        },
        "usePublicLoadBalancer": true,
        "skipValidation": false,
        "loadBalancerIPWhitelists": [
            ""
        ],
        "subnetsForLoadBalancers": [
            ""
        ],
        "staticSubdomain": "mydomain"
    }
  3. Use the JSON file created in the previous step to create the Cloudera AI Inference service instance:
    $ cdp ml create-ml-serving-app --cli-input-json file:///tmp/create-serving-app-input.json

    After a successful invocation, the command displays the CRN of the new Cloudera AI Inference service instance. The command adds the requested compute worker node groups to the existing Kubernetes cluster specified by the clusterCrn field in the request body and installs the necessary software components. You can then verify the new instance, for example by listing the serving applications as sketched below.
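
Before running the create command in step 3, you can optionally look up the environment CRN for the environmentCrn fields and confirm that the edited payload is still valid JSON. The sketch below assumes that jq is installed and that the describe-environment response exposes the CRN at .environment.crn; the [***MY-ENVIRONMENT-NAME***] placeholder is hypothetical, and the compute cluster CRN for the clusterCrn field must be obtained separately from your deployment.

    # Look up the CRN of the environment referenced by the environmentCrn fields
    # (assumes jq is installed and the CRN is returned at .environment.crn).
    $ cdp environments describe-environment \
          --environment-name [***MY-ENVIRONMENT-NAME***] | jq -r '.environment.crn'

    # Confirm that the customized payload is still valid JSON before submitting it.
    $ python -m json.tool /tmp/create-serving-app-input.json > /dev/null && echo "valid JSON"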
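
To confirm that the new instance becomes available, you can list and describe the serving applications with the companion subcommands of create-ml-serving-app, if your CDP CLI version provides them. The subcommand and flag names below (list-ml-serving-apps, describe-ml-serving-app, --app-crn) are assumptions inferred from the create command shown above, and the [***CAI-INFERENCE-SERVICE-CRN***] placeholder stands for the CRN returned in step 3; verify the names against the CLI help before use.

    # List the Cloudera AI Inference service instances visible to you
    # (subcommand name is an assumption; confirm with the CLI help).
    $ cdp ml list-ml-serving-apps

    # Describe a single instance using the CRN printed by the create command
    # (the --app-crn flag name is an assumption).
    $ cdp ml describe-ml-serving-app --app-crn [***CAI-INFERENCE-SERVICE-CRN***]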