Known issues with Cloudera AI Registry standalone API
The following are known issues that you might encounter while using the Cloudera AI Registry standalone API.
- NGC model download timeout
- The NGC model import might time out, in which case the corresponding model version status is shown as “failed”. To access the logs in the API v2 pod, follow the steps in the Debugging the model import failure troubleshooting section.
- Model import failure
- You can download models concurrently only if their combined size is below approximately 400 GB. Exceeding this limit may result in import failures and unexpected behavior.
- Request throttling
- No request throttling mechanism is currently implemented, so excessive concurrent requests may lead to model import failures. To minimize the risk, limit concurrent requests to a maximum of 5, which is considered a safe threshold.
- Model import progress indicator
- No progress bar is available for model imports. For reference, importing a 70 GB model typically takes approximately 1 hour. Plan accordingly and, if necessary, monitor the import through alternative options.
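
Because the registry does not throttle requests itself, the limits listed above have to be enforced on the client side. The following is a minimal Python sketch of one way to do that, not an official client: it groups models into batches of at most 5 whose combined size stays below 400 GB, runs one batch at a time, and gives a rough duration estimate. The names `plan_batches`, `run_batches`, `estimate_hours`, and the `import_model` callable are hypothetical. `import_model` stands in for whatever code sends the actual import request, and the estimate assumes that import time scales linearly from the 70 GB per hour reference point, which the known issues do not guarantee.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

MAX_CONCURRENT_REQUESTS = 5    # safe threshold named in the known issues
MAX_COMBINED_SIZE_GB = 400.0   # approximate combined-size limit
GB_PER_HOUR = 70.0             # reference point: a 70 GB model takes about 1 hour


def plan_batches(model_sizes_gb: Dict[str, float]) -> List[List[str]]:
    """Greedily group models, largest first, so that each batch holds at most
    MAX_CONCURRENT_REQUESTS models with a combined size below the limit.
    A single model at or above the limit ends up alone in its own batch."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0.0
    ordered = sorted(model_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        batch_full = len(current) >= MAX_CONCURRENT_REQUESTS
        too_large = current_size + size >= MAX_COMBINED_SIZE_GB
        if current and (batch_full or too_large):
            batches.append(current)
            current, current_size = [], 0.0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


def estimate_hours(size_gb: float) -> float:
    """Rough estimate only; assumes linear scaling from 70 GB per hour."""
    return size_gb / GB_PER_HOUR


def run_batches(batches: List[List[str]],
                import_model: Callable[[str], object]) -> Dict[str, object]:
    """Run one batch at a time; within a batch, at most
    MAX_CONCURRENT_REQUESTS calls to import_model are in flight."""
    results: Dict[str, object] = {}
    for batch in batches:
        with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_REQUESTS) as pool:
            # pool.map preserves the order of the batch.
            for name, result in zip(batch, pool.map(import_model, batch)):
                results[name] = result
    return results


if __name__ == "__main__":
    # Hypothetical model names and sizes, for illustration only.
    sizes = {"model-a": 300.0, "model-b": 150.0, "model-c": 90.0, "model-d": 70.0}

    def fake_import(name: str) -> str:
        # Placeholder: replace with the code that sends the real import request.
        return "submitted " + name

    for batch in plan_batches(sizes):
        total = sum(sizes[n] for n in batch)
        print(batch, "about", round(estimate_hours(total), 1), "hours of data")
    print(run_batches(plan_batches(sizes), fake_import))
```

Running batches strictly one after another is the conservative choice here: it keeps both the request count and the combined size under the stated limits at every moment, at the cost of some idle time when a batch contains one slow import.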