Uploading Model Repositories for an air-gapped environment
The model artifacts must be manually transferred and uploaded to the cloud storage used by the Cloudera AI Registry and the Cloudera AI Inference service.
Before you begin
You will need to obtain the data lake bucket or container information for your cloud provider to use as the destination for the model artifacts.
- In the Cloudera console, click the Management Console tile.
- Click Environments, then select your AWS environment.
- On the Environment details page, click Summary.
- Scroll down to the Logs Storage and Audit field and copy the storage location. Omit /logs from the location.

Example: If the log storage location is s3://datalakebucket/datalakeenv-dl/logs, the data lake bucket is s3://datalakebucket/datalakeenv-dl. The final destination for the model artifacts will be s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models.
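The path derivation above (drop the trailing /logs segment, then append the registry path) can be sketched in Python. This is an illustrative helper for working out the destination, not part of the Cloudera tooling; the function name is hypothetical:

```python
def destination_from_log_storage(log_location: str) -> str:
    """Derive the model-artifact destination from the copied
    Logs Storage and Audit location: strip a trailing /logs
    segment and append modelregistry/secured-models."""
    bucket = log_location.rstrip("/")
    if bucket.endswith("/logs"):
        bucket = bucket[: -len("/logs")]
    return f"{bucket}/modelregistry/secured-models"

# Example from this procedure:
print(destination_from_log_storage("s3://datalakebucket/datalakeenv-dl/logs"))
# s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models
```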
- Run the import_to_airgap.py script to upload the model artifacts to a secured location in your cloud environment.

For AWS, run the script using the following command:

python3.9 import_to_airgap.py -i -e <endpoint> -c <cloud_type> -s <source_directory> -d <destination> -ri <repository_id>

Example:
python3.9 import_to_airgap.py -c aws -s $PWD/models -d s3://datalakebucket/datalakeenv-dl/modelregistry/secured-models -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809
You can use the following parameters for uploading the models.

Table 1. Parameters for uploading the models

-c    Cloud type (AWS or Azure). Example: -c aws
-s    Source directory; must contain the previously downloaded model artifacts. Example: -s $PWD/models
-d    Destination; must point to the Cloudera AI Registry bucket with the appropriate path, in the format s3://bucket/secured-models. Example: -d s3://bucket/secured-models
-rt   Repository type (Hugging Face or NVIDIA NGC). Example: -rt hf
-ri   Repository ID of the model downloaded to the local filesystem. Example: -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809
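Before invoking the script, it can help to sanity-check the parameter values against the rules in the table. The sketch below is illustrative only; the function name and checks are assumptions, not part of import_to_airgap.py:

```python
import os

def validate_upload_args(cloud: str, source: str, dest: str, repo_id: str) -> list:
    """Return a list of problems with the upload parameters; empty means OK."""
    problems = []
    # -c must be a supported cloud type
    if cloud not in ("aws", "azure"):
        problems.append(f"unsupported cloud type: {cloud}")
    # For AWS, -d must use the s3://bucket/... form
    if cloud == "aws" and not dest.startswith("s3://"):
        problems.append("AWS destination must use the s3://bucket/... form")
    # -s must be an existing directory holding the downloaded artifacts
    if not os.path.isdir(source):
        problems.append(f"source directory not found: {source}")
    # -ri is required
    if not repo_id:
        problems.append("repository ID (-ri) is required")
    return problems
```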
For Azure, run the script using the following command:

python3.9 import_to_airgap.py <endpoint> -c azure -s $PWD/models -d modelregistry/secured-models -ri <repository_id> --account $AZURE_STORAGE_ACCOUNT_NAME --container data

Example:
python3.9 import_to_airgap.py https://ccycloud-5.cml-cai.root.comops.site:9879 -c azure -s $PWD/models -d modelregistry/secured-models -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809 --account datalakeaccount --container data
You can use the following parameters for uploading the models.

Table 2. Parameters for uploading the models

-c           Cloud type (AWS or Azure). Example: -c azure
-s           Source directory; must contain the previously downloaded model artifacts. Example: -s $PWD/models
-d           Destination; must point to the Cloudera AI Registry storage with the appropriate path. Example: -d modelregistry/secured-models
-rt          Repository type (Hugging Face or NVIDIA NGC). Example: -rt hf
-ri          Repository ID of the model downloaded to the local filesystem. Example: -ri nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809
--account    Azure storage account name (Azure only). Example: --account $AZURE_STORAGE_ACCOUNT_NAME
--container  Azure storage container name (Azure only). Example: --container data
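Because the Azure invocation takes many flags, assembling it as an argument list avoids shell-quoting mistakes. A sketch, assuming the flags shown above; the helper name is hypothetical and not part of the Cloudera tooling:

```python
import subprocess

def build_azure_upload_cmd(endpoint, source, dest, repo_id, account, container):
    """Assemble the import_to_airgap.py argument list for an Azure upload."""
    return [
        "python3.9", "import_to_airgap.py", endpoint,
        "-c", "azure",
        "-s", source,
        "-d", dest,
        "-ri", repo_id,
        "--account", account,
        "--container", container,
    ]

cmd = build_azure_upload_cmd(
    "https://ccycloud-5.cml-cai.root.comops.site:9879",
    "./models",
    "modelregistry/secured-models",
    "nim/meta/llama-3_1-70b-instruct:0.11.1+14957bf8-h100x4-fp8-throughput.1.2.18099809",
    "datalakeaccount",
    "data",
)
# subprocess.run(cmd, check=True)  # uncomment to execute the upload
```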