Known issues and limitations in Cloudera Data Engineering on CDP Private Cloud
This page lists the current known issues and limitations that you might run into while using the Cloudera Data Engineering (CDE) service.
- DEX-14676: Deep Analysis is not working in CDE Private Cloud under the Analysis tab
- If you use Spark version 2.x to run your jobs, the Run Deep Analysis feature under the Analysis tab is not supported on Cloudera Data Engineering Private Cloud.
- DEX-6743: CDE CLI command execution sometimes displays an End of File (EOF) error message at the end
- CDE CLI command execution sometimes displays an EOF error message at the end even though the command executes successfully. This generally happens when the response is delayed due to network issues or a timeout.
- DOCS-17844: Logs are lost if the log lines are longer than 50000 characters in fluentd
- This issue occurs when the Buffer_Chunk_Size parameter for fluent-bit is set to a value smaller than the size of the log line.
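As a workaround sketch, you can raise the buffer sizes in the fluent-bit tail input so that they exceed your longest expected log line. The Path value and the 64k/128k sizes below are illustrative assumptions, not values taken from a CDE deployment.

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        # Assumed values: set both sizes larger than the longest
        # expected log line (more than 50000 characters here).
        Buffer_Chunk_Size 64k
        Buffer_Max_Size   128k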
- DEX-8659: A non-functional Authoring UI field is displayed on the Airflow job creation page
- If you are using the default Virtual Cluster in CDP 1.4.1, you might see a new Authoring UI field on the Airflow job creation page, but it is not functional.
- OPSAPS-65424: Embedded Container Service (ECS) 1.3.4 to 1.4.1 Control Plane upgrade loops forever in an error state
- Upgrading the ECS version while the CDE service is enabled can cause the Control Plane upgrade to loop forever in an error state.
- DEX-8226: Grafana charts of new Virtual Clusters are not accessible on upgraded clusters if the Virtual Clusters are created on an existing CDE service
- If you upgrade the cluster from 1.3.4 to 1.4.x and create a new Virtual Cluster on the existing CDE service, Grafana charts are not displayed. This is due to broken APIs.
- DEX-7000: Parallel Airflow tasks triggered at exactly the same time by the user throw a 401: Unauthorized error
- A 401: Unauthorized error is displayed when parallel Airflow tasks in an Airflow job are triggered or launched at exactly the same time by the user.
- DEX-7001: When Airflow jobs are run, the privileges of the user who created the job are applied, not those of the user who submitted the job
- Regardless of who submits the Airflow job, it runs with the privileges of the user who created it. This causes issues when the job submitter has fewer privileges than the job owner.
- DEX-7022: Virtual Cluster does not accept Spark or Airflow jobs if the tzinfo library is used for the start date
- If you use the tzinfo library for start_date, the Virtual Cluster may not complete execution of Spark or Airflow jobs launched later. For example:

    example_dag = DAG(
        'bashoperator-parameter-job',
        default_args=default_args,
        start_date=parser.isoparse("2020-11-11T20:20:04.268Z").replace(tzinfo=timezone.utc),
        schedule_interval='@once',
        is_paused_upon_creation=False
    )
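A possible workaround is to pass a naive (timezone-unaware) datetime as start_date so that the tzinfo code path is avoided. The following is a minimal sketch based on the example above; using a plain datetime is an illustrative suggestion, not a documented fix, and default_args is assumed to be defined elsewhere in your DAG file.

    from datetime import datetime
    from airflow import DAG

    # Sketch: use a naive datetime instead of a tzinfo-aware timestamp.
    example_dag = DAG(
        'bashoperator-parameter-job',
        default_args=default_args,
        start_date=datetime(2020, 11, 11, 20, 20, 4),
        schedule_interval='@once',
        is_paused_upon_creation=False
    )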
- Changing LDAP configuration after installing CDE breaks authentication
- If you change the LDAP configuration after installing CDE, as described in Configuring LDAP authentication for CDP Private Cloud, authentication no longer works.
- Gang scheduling is not supported
- Gang scheduling is not currently supported for CDE on CDP Private Cloud.
- HDFS is the default filesystem for all resource mounts
- For any jobs that use local filesystem paths as arguments to a Spark job, explicitly specify file:// as the scheme. For example, if your job uses a mounted resource called test-resource.txt, in the job definition you would typically refer to it as /app/mount/test-resource.txt. In CDP Private Cloud, this should be specified as file:///app/mount/test-resource.txt.
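For illustration, here is a minimal PySpark sketch that reads the mounted resource with the explicit file:// scheme. The resource name test-resource.txt matches the example above; the appName and the use of spark.read.text are illustrative assumptions about how your job consumes the file.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-mounted-resource").getOrCreate()

    # Without the file:// scheme, this path would be resolved against HDFS,
    # the default filesystem for resource mounts in CDP Private Cloud.
    df = spark.read.text("file:///app/mount/test-resource.txt")
    df.show()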
- Apache Ozone is supported only for log files
- Apache Ozone is supported only for log files. It is not supported for job configurations, resources, and so on.
- Scheduling jobs with URL references does not work
- Scheduling a job that specifies a URL reference does not work.
Limitations
- Access key-based authentication is not enabled in clusters upgraded from releases prior to CDP PVC 1.3.4
- After you upgrade to PVC 1.3.4 from an earlier version, you must create the CDE Base service and Virtual Cluster again to use the new Access Key feature. Otherwise, the Access Key feature is not supported in CDE Base services created prior to the 1.3.4 upgrade.