Known Issues
You might run into some known issues while using Cloudera Private Cloud.
- Using dollar character in environment variables in Cloudera AI
Environment variables containing the dollar ($) character are not parsed correctly by Cloudera AI. For example, if you set
PASSWORD="pass$123"
in the project environment variables and then read it with the echo command, the output is: pass23
Workaround: Use one of the following commands to print the $ sign: echo 24 | xxd -r -p or echo JAo= | base64 -d
Insert the $ sign into the value of the environment variable by wrapping the command in a command substitution, using either $() or backticks (``). For example, to set the environment variable to ABC$123, specify: ABC$(echo 24 | xxd -r -p)123 or ABC`echo 24 | xxd -r -p`123
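The substitution described above can be applied mechanically to any value that contains a dollar sign. The following is a minimal sketch, not part of Cloudera AI: the helper name `escape_dollars` is hypothetical, and it only produces the string to paste into the project environment variable field.

```python
import base64

# Both workaround commands emit a literal dollar sign:
# hex 24 is "$", and base64 "JAo=" decodes to "$" followed by a newline
# (shell command substitution strips that trailing newline).
assert bytes.fromhex("24") == b"$"
assert base64.b64decode("JAo=") == b"$\n"

# Command substitution taken from the workaround text.
DOLLAR_SUBSTITUTION = "$(echo 24 | xxd -r -p)"


def escape_dollars(value: str) -> str:
    """Replace each literal '$' with the command substitution from the workaround.

    Hypothetical helper for illustration only.
    """
    return value.replace("$", DOLLAR_SUBSTITUTION)


print(escape_dollars("ABC$123"))
```

The printed string matches the first form given in the workaround. It assumes xxd is available in the session image; if it is not, the base64 variant can be substituted in `DOLLAR_SUBSTITUTION`.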
- DSE-37827: Jupyter's RTC extension throws an error and notebooks become unusable
In certain cases, Jupyter’s RTC (Real Time Collaboration) extension may cause errors claiming either that other sessions are active, or that other processes have accessed the notebook files. After these errors, the notebook becomes unusable due to the error messages and the Cloudera AI session needs to be restarted.
Workaround:
You must disable the Jupyter RTC extension by performing the following tasks:
- Create a Session.
- Open the terminal.
- Enter nano /home/cdsw/.jupyter/labconfig/page_config.json.
- Add the following lines to the file:
{ "disabledExtensions": { "@jupyter/collaboration-extension": true }, "lockedExtensions": { "@jupyter/collaboration-extension": true } }
- Save and close the file.
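If the file already holds other settings, editing it by hand risks overwriting them. The following is a minimal sketch of the same change done from a Python session. The function name `disable_rtc` is hypothetical, and the sketch assumes that any existing disabledExtensions and lockedExtensions entries use the object (name: true) form shown above.

```python
import json
from pathlib import Path

RTC_EXTENSION = "@jupyter/collaboration-extension"


def disable_rtc(config_path: Path) -> dict:
    """Mark the RTC extension as disabled and locked, keeping other settings.

    Hypothetical helper; assumes existing entries are JSON objects, not lists.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text()
        if text.strip():
            config = json.loads(text)
    for key in ("disabledExtensions", "lockedExtensions"):
        config.setdefault(key, {})[RTC_EXTENSION] = True
    # Create the labconfig directory if it does not exist yet.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

To apply it to the path from the steps above, call `disable_rtc(Path("/home/cdsw/.jupyter/labconfig/page_config.json"))` inside the session.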
- DSE-36718: Disable auto synchronization feature for users and teams
The automated team and user synchronization feature is disabled. Newly installed or upgraded workbenches do not have the automatic synchronization option in the Cloudera AI UI.
Workaround: none
- DSE-36759: AMPs and Feature Announcement sections do not work in NTP setups
Cloudera AI Private Cloud setups that use a Non-Transparent Proxy (NTP) do not function properly, which affects Cloudera Accelerators for Machine Learning Projects (AMPs) and Feature Announcements. The home page freezes, the Feature Announcement section displays an error message, and the AMPs do not load.
Workaround:
To avoid the home page freeze, copy the following environment variables from the web deployment and add them to the environment section of the API deployments:
- HTTP_PROXY
- HTTPS_PROXY
- NO_PROXY
- http_proxy
- https_proxy
- no_proxy
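The copy step can be sketched as a merge of two container environment lists in the Kubernetes name/value format. This is an illustration only: the function name `copy_proxy_env` is hypothetical, the sample values are placeholders, and retrieving and applying the lists (for example, by editing the deployments) is left out.

```python
# The six proxy variables named in the workaround.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
              "http_proxy", "https_proxy", "no_proxy")


def copy_proxy_env(web_env: list[dict], api_env: list[dict]) -> list[dict]:
    """Return api_env with the proxy entries from web_env added or overwritten.

    Hypothetical helper; both arguments are lists of {"name": ..., "value": ...}.
    """
    proxy_entries = {e["name"]: e for e in web_env if e["name"] in PROXY_VARS}
    # Keep every API entry that is not being replaced by a proxy entry.
    merged = [e for e in api_env if e["name"] not in proxy_entries]
    merged.extend(proxy_entries[n] for n in PROXY_VARS if n in proxy_entries)
    return merged


# Placeholder values for illustration.
web_env = [
    {"name": "HTTP_PROXY", "value": "http://proxy.example.com:3128"},
    {"name": "NO_PROXY", "value": "localhost,.svc"},
    {"name": "LOG_LEVEL", "value": "info"},
]
api_env = [{"name": "PORT", "value": "8080"}]
print(copy_proxy_env(web_env, api_env))
```

Only the proxy variables are copied; unrelated entries from the web deployment, such as the placeholder LOG_LEVEL, are left behind.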
- DSE-32943: Enabling Service Accounts
Teams in the Cloudera AI Workbench can run workloads within team projects using the Run as option for service accounts only if the service accounts were first manually added to the team as collaborators.
- DSE-35013: First Cloudera AI Workbench creation fails
On RHEL 8.8, during the first Cloudera AI Workbench installation on GPU with a Cloudera Embedded Container Service external registry, pods might get stuck in the init or CrashLoop state.
The first workbench installation is expected to fail. Treat it as a test workbench, and apply the following manual workaround before creating subsequent workbenches:
- Restart or delete the pods that are in the init or CrashLoop state in the test workbench.
- Once all pods are in the running state, create new workbenches as needed.
- Delete the test workbench from the Cloudera AI UI if no longer needed.
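To find the pods that need attention in the first step, the pod list of the workbench namespace can be filtered by status. The sketch below is an assumption-laden illustration: it expects the parsed output of `kubectl get pods -n <namespace> -o json`, the function name `stuck_pods` is hypothetical, and it cannot tell a pod that is stuck in init from one that is still initializing normally.

```python
def stuck_pods(pod_list: dict) -> list[str]:
    """Return names of pods that are waiting on init containers or crash looping.

    Hypothetical helper operating on a parsed Kubernetes PodList object.
    """
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        init_statuses = status.get("initContainerStatuses", [])
        main_statuses = status.get("containerStatuses", [])
        # Init state: the pod is Pending and some init container has not terminated.
        in_init = status.get("phase") == "Pending" and any(
            "terminated" not in s.get("state", {}) for s in init_statuses
        )
        # CrashLoop state: any container is waiting with reason CrashLoopBackOff.
        crash_loop = any(
            s.get("state", {}).get("waiting", {}).get("reason") == "CrashLoopBackOff"
            for s in init_statuses + main_statuses
        )
        if in_init or crash_loop:
            names.append(pod["metadata"]["name"])
    return names
```

The returned names can then be restarted or deleted as described in the first step, and the check repeated until the list is empty.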
- OPSX-4603: BuildKit in Cloudera Embedded Container Service in Cloudera AI Private Cloud
Issue: BuildKit was introduced in Cloudera Embedded Container Service for building images of models and experiments. BuildKit replaces Docker, which was previously used to build images of Cloudera AI's models and experiments in Cloudera Embedded Container Service. BuildKit is supported only on RHEL 8.x and CentOS 8.x.
BuildKit in Cloudera AI Private Cloud 1.5.2 is a Technical Preview feature. Therefore, Docker must still be installed on the nodes/hosts for models and experiments to work smoothly. An upcoming release will completely eliminate the dependency on Docker on the nodes.
Workaround: None.
- DSE-32285: Migration: Migrated models are failing due to image pull errors
Issue: After a CDSW to Cloudera AI migration (on-premises) using the full-fledged migration tool, migrated models on a Cloudera AI Workbench on Private Cloud fail on initial deployment. This is because the initial model deployment tries to pull images from the on-premises registry.
Workaround: Redeploy the migrated model. Because redeployment involves the build and deploy process, the image is built, pushed to the registry configured for the Cloudera AI Workbench on Private Cloud, and then used from that registry for subsequent deployments.
- DSE-28768: Spark Pushdown is not working with Scala 2.11 runtime
Issue: Scala and R are not supported for Spark Pushdown.
Workaround: None.
- DSE-32304: On Cloudera AI Private Cloud with Cloudera Embedded Container Service, terminal and SSH connections can terminate
Issue: In Cloudera Private Cloud with Cloudera Embedded Container Service, Cloudera AI Terminal and SSH connections can terminate after an unpredictable amount of time, usually after 4-10 minutes. This issue affects the use of local IDEs with Cloudera AI, as well as any customer application that uses a websocket connection.
Workaround: None.
- DSE-35251: Web pod crashes if forking a project takes more than 60 minutes
The web pod crashes if forking a project takes more than 60 minutes. This is because the timeout is set to 60 minutes by the grpc_git_clone_timeout_minutes property. The following error is displayed after the web pod crashes:
2024-04-23 22:52:36.384 1737 ERROR AppServer.VFS.grpc crossCopy grpc error data = [{"error":"1"},{"code":4,"details":"2","metadata":"3"},"Deadline exceeded",{}] ["Error: 4 DEADLINE_EXCEEDED: Deadline exceeded\n at callErrorFromStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/call.js:31:19)\n at Object.onReceiveStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client.js:192:76)\n at Object.onReceiveStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141)\n at Object.onReceiveStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181)\n at /home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/resolving-call.js:94:78\n at process.processTicksAndRejections (node:internal/process/task_queues:77:11)\nfor call at\n at ServiceClientImpl.makeUnaryRequest (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client.js:160:34)\n at ServiceClientImpl.crossCopy (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)\n at /home/cdswint/services/web/server-dist/grpc/vfs-client.js:235:19\n at new Promise (<anonymous>)\n at Object.crossCopy (/home/cdswint/services/web/server-dist/grpc/vfs-client.js:234:12)\n at Object.crossCopy (/home/cdswint/services/web/server-dist/models/vfs.js:280:38)\n at projectForkAsyncWrapper (/home/cdswint/services/web/server-dist/models/projects/projects-create.js:229:19)"] node:internal/process/promises:288 triggerUncaughtException(err, true /* fromPromise */); ^Error: 4 DEADLINE_EXCEEDED: Deadline exceeded at callErrorFromStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/call.js:31:19) at Object.onReceiveStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client.js:192:76) at Object.onReceiveStatus (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141) at Object.onReceiveStatus 
(/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181) at /home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/resolving-call.js:94:78 at process.processTicksAndRejections (node:internal/process/task_queues:77:11) for call at at ServiceClientImpl.makeUnaryRequest (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/client.js:160:34) at ServiceClientImpl.crossCopy (/home/cdswint/services/web/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19) at /home/cdswint/services/web/server-dist/grpc/vfs-client.js:235:19 at new Promise (<anonymous>) at Object.crossCopy (/home/cdswint/services/web/server-dist/grpc/vfs-client.js:234:12) at Object.crossCopy (/home/cdswint/services/web/server-dist/models/vfs.js:280:38) at projectForkAsyncWrapper (/home/cdswint/services/web/server-dist/models/projects/projects-create.js:229:19) { code: 4, details: 'Deadline exceeded', metadata: Metadata { internalRepr: Map(0) {}, options: {} } }
Workaround: Increase the timeout limit, for example, to 120 minutes, using the grpc_git_clone_timeout_minutes property.
UPDATE site_config SET grpc_git_clone_timeout_minutes = <new value>;
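For a concrete instance of the statement with the placeholder filled in, the following minimal sketch builds the SQL text for a given number of minutes. The function name `build_timeout_update` is hypothetical, and how the statement is run against the workbench database is not covered here.

```python
def build_timeout_update(minutes: int) -> str:
    """Return the UPDATE statement from the workaround with a validated value.

    Hypothetical helper; it only formats the SQL text and does not execute it.
    """
    # Reject booleans, non-integers, and non-positive values.
    if isinstance(minutes, bool) or not isinstance(minutes, int) or minutes <= 0:
        raise ValueError("timeout must be a positive whole number of minutes")
    return f"UPDATE site_config SET grpc_git_clone_timeout_minutes = {minutes};"


print(build_timeout_update(120))
```

Validating the value as a positive integer before formatting avoids pasting a malformed or empty value into the statement.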