Uploading Model Repositories for an air-gapped environment

The model artifacts must be manually transferred and uploaded to the cloud storage used by the Cloudera AI Registry and the Cloudera AI Inference service.

Before you begin

You need the data lake bucket (AWS) or container (Azure) information for your cloud provider to use as the destination for the model artifacts. Follow the first set of steps for AWS and the second set for Azure.

  1. In the Cloudera console, click the Management Console tile.

  2. Click Environments, then select your AWS environment.

  3. On the Environment details page, click Summary.

  4. Scroll down to the Logs Storage and Audit field and copy the storage location.

  5. Remove the trailing /logs from the location to get the data lake bucket.

    Example: If the log storage location is s3://datalakebucket/datalakeenv-dl/logs, the data lake bucket is s3://datalakebucket/datalakeenv-dl. The final destination for the model artifacts is then s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models.
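
The path manipulation in the AWS example can be sketched in a few lines of Python. This is only an illustration of the rule described above; the function name `model_artifact_destination` is hypothetical and is not part of any Cloudera tooling.

```python
def model_artifact_destination(log_location: str) -> str:
    """Derive the model artifact destination from the Logs Storage and Audit location.

    Hypothetical helper: it only mirrors the manual steps described in the text.
    """
    location = log_location.rstrip("/")
    # Drop the trailing /logs component to get the data lake bucket path.
    if location.endswith("/logs"):
        location = location[: -len("/logs")]
    # Append the registry path used in the documented example.
    return f"{location}/modelregistry/secured-models"


print(model_artifact_destination("s3://datalakebucket/datalakeenv-dl/logs"))
# prints s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models
```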

  1. In the Cloudera console, click the Management Console tile.

  2. Click Environments, then select your Azure environment.

  3. On the Environment details page, click Summary.

  4. Scroll down to the Logs Storage and Audit field and copy the storage location.
    Example: If the log storage location is data@datalakeaccount.dfs.core.windows.net, the container name is data, and the account name is datalakeaccount. You will need this information for the --account and --container parameters when running the upload script.
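
Splitting the Azure storage location into its container and account parts can be sketched as follows. The helper name `parse_azure_location` is hypothetical, and the handling of an optional scheme prefix (such as abfs://) or a trailing path is an assumption added for robustness; the text above only shows the bare container@account form.

```python
def parse_azure_location(location: str) -> tuple[str, str]:
    """Split '<container>@<account>.dfs.core.windows.net' into (container, account).

    Hypothetical helper mirroring the manual reading of the storage location.
    """
    # Assumption: tolerate an optional scheme prefix and any trailing path.
    without_scheme = location.split("://", 1)[-1]
    authority = without_scheme.split("/", 1)[0]
    container, sep, host = authority.partition("@")
    if not sep or not container or not host:
        raise ValueError(f"unexpected storage location: {location!r}")
    # The account name is the first label of the host name.
    account = host.split(".", 1)[0]
    return container, account


container, account = parse_azure_location("data@datalakeaccount.dfs.core.windows.net")
print(container, account)  # prints: data datalakeaccount
```

The two values map to the --container and --account parameters of the upload script.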
  1. Run the import_to_airgap.py script to upload the model artifacts to a secured location in your cloud environment.

    For AWS, use the following command:

    python3.9 import_to_airgap.py -i -e <endpoint> -c <cloud_type> -s <source_directory> -d <destination> -ri <repository_id>
                            
    Example:
    python3.9 import_to_airgap.py -c aws -s $PWD/models -d s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809

    You can use the following parameters for uploading the Models.

    Table 1. Parameters for uploading the models
    Parameter | Description | Example
    -c | Cloud type (AWS or Azure). | -c aws
    -s | Source directory. Must contain the previously downloaded model artifacts. | -s $PWD/models
    -d | Destination. Must point to the Cloudera AI Registry bucket with the appropriate path, in the format s3://bucket/secured-models. | -d s3://bucket/secured-models
    -rt | Repository type (Hugging Face or NVIDIA NGC). | -rt hf
    -ri | Repository ID of the model downloaded to the local filesystem. | -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809
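
Before running the AWS upload, it can help to check the -s and -d values against the requirements in the table. The following is a minimal sketch of such a pre-flight check; `check_aws_upload_args` is a hypothetical helper, not part of import_to_airgap.py, and the rules it enforces are only those stated above (a populated source directory and an s3:// destination ending in secured-models).

```python
from pathlib import Path


def check_aws_upload_args(source_dir: str, destination: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    src = Path(source_dir)
    # -s must be a directory that already holds the downloaded artifacts.
    if not src.is_dir():
        problems.append(f"source directory not found: {source_dir}")
    elif not any(src.iterdir()):
        problems.append(f"source directory is empty: {source_dir}")
    # -d must be an s3:// location whose path ends in secured-models.
    if not destination.startswith("s3://"):
        problems.append("destination must start with s3://")
    if not destination.rstrip("/").endswith("/secured-models"):
        problems.append("destination must end with /secured-models")
    return problems


for problem in check_aws_upload_args(
    "./models",
    "s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models",
):
    print(problem)
```

The check does not contact the bucket; it only catches obvious mistakes in the arguments before the transfer starts.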

    For Azure, use the following command:

    python3.9 import_to_airgap.py <endpoint> -c azure -s $PWD/models -d modelregistry/secured-models -ri <repository_id> --account $AZURE_STORAGE_ACCOUNT_NAME --container data
                                
    Example:
    python3.9 import_to_airgap.py https://ccycloud-5.cml-cai.root.comops.site:9879 -c azure -s $PWD/models -d modelregistry/secured-models -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809 --account datalakeaccount --container data

    You can use the following parameters for uploading the Models.

    Table 2. Parameters for uploading the models
    Parameter | Description | Example
    -c | Cloud type (AWS or Azure). | -c azure
    -s | Source directory. Must contain the previously downloaded model artifacts. | -s $PWD/models
    -d | Destination path for the model artifacts. The Azure commands above use modelregistry/secured-models; the storage account and container are given separately with --account and --container. | -d modelregistry/secured-models
    -rt | Repository type (Hugging Face or NVIDIA NGC). | -rt hf
    -ri | Repository ID of the model downloaded to the local filesystem. | -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809
    --account | Azure storage account name (Azure only). | --account $AZURE_STORAGE_ACCOUNT_NAME
    --container | Azure storage container name (Azure only). | --container data
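
When the Azure upload is driven from another script, assembling the arguments as a list avoids shell quoting mistakes in the long repository ID. The sketch below builds the same argument order as the Azure example above; `azure_upload_command` is a hypothetical wrapper, and it assumes the script accepts exactly the parameters shown in this section.

```python
import shlex


def azure_upload_command(endpoint, source_dir, repository_id, account, container,
                         destination="modelregistry/secured-models"):
    """Build the argument list for the Azure upload command shown in this section."""
    return [
        "python3.9", "import_to_airgap.py", endpoint,
        "-c", "azure",
        "-s", source_dir,
        "-d", destination,
        "-ri", repository_id,
        "--account", account,
        "--container", container,
    ]


argv = azure_upload_command(
    endpoint="https://ccycloud-5.cml-cai.root.comops.site:9879",
    source_dir="./models",
    repository_id=("nim/meta/llama-3_1-70b-instruct:"
                   "0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809"),
    account="datalakeaccount",
    container="data",
)
# Print a shell-safe rendering of the command for review before running it.
print(shlex.join(argv))
```

The list can then be passed to `subprocess.run(argv, check=True)` so that a non-zero exit status from the script raises an error.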