Known Issues and Limitations

This section lists the known issues and limitations specific to Cloudera Data Science Workbench.

AMPs

  • AMPs display a warning message even though engine:15 is present. For AMPs, the UI might show a warning message even though engine:14 or engine:15 is present as the supported engine for the AMP. You can disregard this message if either of these engines is selected.

    Cloudera Bug: DSE-19086

  • An AMP creation step might be reported as succeeded even if a requirement installation failed. Such installation steps might fail due to intermittent network issues.

    Workaround: Check and rerun related project creation steps.

    Cloudera Bug: DSE-17966

  • Certain AMPs might not provide default ML Runtimes on the project configuration page. In such cases, Cloudera recommends using Python 3.7 as the kernel. Later Python kernels might not be supported due to the libraries used (for example, torch 1.6.0 does not support Python 3.9).

    Cloudera Bug: DSE-17965
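
    As a rough illustration of the Python 3.7 recommendation above, a project script could check the running kernel version before installing requirements. This is only a sketch; the function name is_recommended_kernel is hypothetical and is not part of any AMP or CDSW API.

```python
import sys

# Kernel version recommended in the note above for AMPs that do not
# provide default ML Runtimes.
RECOMMENDED = (3, 7)


def is_recommended_kernel(version_info=None):
    """Return True when the interpreter's major.minor matches RECOMMENDED."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == RECOMMENDED


if not is_recommended_kernel():
    print(
        "Warning: running Python %d.%d; this AMP may need Python %d.%d."
        % (sys.version_info[0], sys.version_info[1], *RECOMMENDED)
    )
```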

API v2

  • API v2 limitation: search_filter is not applied on API v2 Python calls.

    Cloudera Bug: DSE-18804

  • API v2 Limitation: Custom project templates created by an Administrator are not supported by API v2.

    Cloudera Bug: DSE-18758

  • API v2 Limitation: Jobs configured via API v2 can only be set to send email notifications on success events.

    Cloudera Bug: DSE-18754

  • API v2 known issue: Creating a job with invalid fields can cause the client to stop working and return 502 responses on all subsequent calls.

    Workaround: Stop the code execution and recreate the client.

    Cloudera Bug: DSE-18753

  • API v2 limitation: Workloads created through API v2 can use Python 2 kernels even when a site administrator has disabled Python 2 support in the UI.

    Cloudera Bug: DSE-18744

  • API v2 Swagger: The Authorization header is not passed to some endpoint calls.

    Cloudera Bug: DSE-18703

  • Projects deleted through API v2 are not automatically garbage collected.

    Workaround: Mark every old deleted project for garbage collection.

    Cloudera Bug: DSE-21888
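
    For the 502 issue above (DSE-18753), the workaround of recreating the client could be wrapped in a small helper. The sketch below is a generic pattern, not the API v2 client itself: ApiError, ClientHolder, and make_client are hypothetical names, and the real client's error type and constructor may differ.

```python
class ApiError(Exception):
    """Hypothetical stand-in for the HTTP error raised by an API client."""

    def __init__(self, status):
        super().__init__("HTTP %d" % status)
        self.status = status


class ClientHolder:
    """Keep one client and replace it after a 502 response.

    make_client is assumed to be a zero-argument callable that returns a
    new, working API client.
    """

    def __init__(self, make_client):
        self._make_client = make_client
        self.client = make_client()

    def call(self, operation):
        """Run operation(client) and return its result."""
        try:
            return operation(self.client)
        except ApiError as err:
            if err.status == 502:
                # Replace the client so later calls start from a clean state.
                # The failing call itself is not retried.
                self.client = self._make_client()
            raise
```

    The failing request is re-raised rather than retried, because a request with invalid fields would presumably fail again; only later calls use the new client.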

Jobs

  • External email addresses specified in Job settings are removed every time the job is paused and resumed.

    Cloudera Bug: DSE-18987

  • Environment variable values are only visible in sessions, applications, jobs, and models that you launched. You can use these environment variables to securely store confidential information such as your AWS or database credentials.

    Cloudera Bug: DSE-19066

  • MLOps governance does not work because of an Atlas/Kafka/CDSW configuration issue.

    Cloudera Bug: DSE-18892

  • Workers that cannot be scheduled because of the pod quota limit appear to linger and continue consuming resources.

    Cloudera Bug: DSE-18479

  • Job notification emails are sent inconsistently. Whether an email is sent successfully depends on which notification types are selected.

    Workaround:

    1. Uncheck all status boxes for the internal user and team owner.
    2. Add your email address as an external email.
    3. Check all status boxes (Success, Failure, Stopped, Timeout) together for the external email.
    4. Use your email client to filter out unwanted notifications.

    Cloudera Bug: DSE-23003

Models

  • During build time, the model/experiment does not have access to user-level environment variables.

    Workaround: Add the user-level environment variables at the Administrative/Project level instead.

    Cloudera Bug: DSE-19067

  • Model creation fails with an ambiguous "Failed to create model" message when a model with the same name already exists. The issue also applies when using API v2 directly.

    Cloudera Bug: DSE-7509

  • When viewing projects, if you select a team context you'll still only see your own projects.

    Cloudera Bug: DSE-22656

SAML

  • Users are unable to log in to CDSW 1.10.1 through the SAML 2.0-based Single Sign-On workflow in the browser if the CDSW cluster is configured to use SAML 2.0 for authentication. For CDSW 1.10.1 configured with LDAP or local authentication, users are able to log in to CDSW as expected.

    Cloudera Bug: DSE-22682

Technical Service Bulletins

TSB 2023-628: Sensitive user data getting collected in CML/CDSW workspace diagnostic bundles

When using Cloudera Data Science Workbench (CDSW), Cloudera recommends that users store sensitive information, such as passwords or access keys, in environment variables rather than in the code. See Engine Environment Variables in the official Cloudera documentation for details. Cloudera recently learned that all session environment variables in the affected releases of CDSW and CML are logged in web pod logs, which may be included in support diagnostic bundles sent to Cloudera as part of support tickets.
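
The recommendation to keep secrets out of code can be sketched as follows. This is a minimal illustration, not a Cloudera API: the helper names get_secret and mask and the variable name DB_PASSWORD are hypothetical.

```python
import os


def get_secret(name, environ=None):
    """Return an environment variable's value, failing loudly if it is unset."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError("Environment variable %s is not set" % name)
    return value


def mask(value, visible=2):
    """Mask a secret for display so the full value never reaches a log line."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


# Example with an explicit mapping instead of the real environment.
password = get_secret("DB_PASSWORD", {"DB_PASSWORD": "s3cret"})
print("Using password", mask(password))
```

Reading the value at run time keeps it out of the project files; masking it before printing avoids writing it to session output.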

Severity:
  • Medium
Component affected:
  • CDSW
  • CML workspaces on Public Cloud
  • CML workspaces on Private Cloud
Products affected:
  • Cloudera Machine Learning
Releases affected:
  • CDSW 1.10.1 and lower
  • CML workspaces on Public Cloud 2.0.32-b117 and lower
  • CML workspaces on Private Cloud 1.4.0 and lower
Users affected:
  • CML workspace users who are storing sensitive data like DB passwords or secrets as environment variables in the product.
  • CML workspace users who are setting WORKLOAD_PASSWORD in CML Public Cloud workspaces from User Settings > Environment Variables > WORKLOAD_PASSWORD.
Impact:
  • Session environment variables in CML workspaces (Private Cloud, Public Cloud or OnPrem CDSW) are logged and collected when an administrator generates diagnostic bundles from CDSW/CML Workspace > Site Administration > Support > Generate Log Archives. These logs are typically sent to Cloudera as part of support cases.
  • On Public Cloud, CML workspace service (web) logs, which are the source of the diagnostic bundles, also get stored on customer S3 or ADLS storage.
Action required: Upgrade CDSW/CML workspace
  • Upgrade to CDSW 1.10.2 version or higher.
  • Public Cloud
    • Upgrade Public Cloud CML workspaces to the latest release.
    • Delete the CML workspace service logs from S3 or ADLS storage.
      • Find the Storage Location for “Logs Storage and Audits” on the Environment service details page (referred to here as <datalake_logs_path>).
      • Find the Cluster Name on the CML Workspace details page (referred to here as <cluster_name>).
      • Delete all files under the <datalake_logs_path>/<cluster_name> folder that were generated before the CML workspace was upgraded to the release that contains this fix.
  • Private Cloud
    • Upgrade Private Cloud CML workspace to 1.4.1 or higher.
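
The Public Cloud log cleanup step above amounts to selecting files under <datalake_logs_path>/<cluster_name> that predate the upgrade. The sketch below shows only that selection logic. It assumes you already have a listing of (key, last_modified) pairs from your storage tooling; the function name, paths, and dates are made up for illustration, and nothing is deleted.

```python
from datetime import datetime, timezone


def select_logs_to_delete(objects, datalake_logs_path, cluster_name, upgraded_at):
    """Return keys under <datalake_logs_path>/<cluster_name>/ modified before upgraded_at.

    objects is an iterable of (key, last_modified) pairs taken from a storage
    listing. This function only selects keys; it deletes nothing.
    """
    prefix = "%s/%s/" % (datalake_logs_path.rstrip("/"), cluster_name.strip("/"))
    return [
        key
        for key, last_modified in objects
        if key.startswith(prefix) and last_modified < upgraded_at
    ]


# Made-up listing for illustration only.
upgraded_at = datetime(2023, 3, 1, tzinfo=timezone.utc)
listing = [
    ("logs/ml-cluster/web-1.log", datetime(2023, 2, 20, tzinfo=timezone.utc)),
    ("logs/ml-cluster/web-2.log", datetime(2023, 3, 5, tzinfo=timezone.utc)),
    ("logs/other-cluster/web-1.log", datetime(2023, 2, 20, tzinfo=timezone.utc)),
]
to_delete = select_logs_to_delete(listing, "logs", "ml-cluster", upgraded_at)
print(to_delete)
```

Reviewing the selected keys before removing anything helps avoid deleting logs that belong to another cluster or that were written after the upgrade.
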
Addressed in release/refresh/patch:
  • CDSW 1.10.2
  • Public Cloud 2.0.32-b123
  • Private Cloud 1.4.1
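
To check an installed version against the fixed releases listed above, version strings need to be compared numerically rather than as text (as text, "1.9.2" sorts after "1.10.2"). The following is a sketch under the assumption that version strings look like the ones in this bulletin; the product keys and function names are hypothetical.

```python
import re

# First fixed release per product, taken from the list above.
FIXED_IN = {
    "cdsw": "1.10.2",
    "public_cloud": "2.0.32-b123",
    "private_cloud": "1.4.1",
}


def parse_version(text):
    """Turn '1.10.1' into (1, 10, 1) and '2.0.32-b117' into (2, 0, 32, 117)."""
    return tuple(int(number) for number in re.findall(r"\d+", text))


def is_affected(product, version):
    """Return True when version is lower than the first fixed release."""
    return parse_version(version) < parse_version(FIXED_IN[product])


print(is_affected("cdsw", "1.10.1"))
print(is_affected("public_cloud", "2.0.32-b123"))
```
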
Knowledge article
For the latest update on this issue, see the corresponding Knowledge article: TSB-2023-628: Sensitive user data getting collected in CML/CDSW workspace diagnostic bundles