General known issues with Cloudera Data Engineering
Learn about the general known issues with the Cloudera Data Engineering (CDE) service on public clouds, their impact on functionality, and the available workarounds.
- DEX-3997 : Python jobs using virtual environment fail with import error
- Running a Python job that uses a virtual environment resource fails with an import error, such as:
  Traceback (most recent call last):
    File "/tmp/spark-826a7833-e995-43d2-bedf-6c9dbd215b76/app.py", line 3, in <module>
      from insurance.beneficiary import BeneficiaryData
  ModuleNotFoundError: No module named 'insurance'
- Workaround: Do not set the spark.pyspark.driver.python configuration parameter when using a Python virtual environment resource in a job.
- DEX-2239 : Internal PyPI mirrors behind VPNs are not supported
- Creating a Python virtual environment resource in CDE fails if you are using an internal PyPI mirror behind a VPN.
- Workaround: Make sure your internal PyPI mirror is accessible without a VPN, or use the public PyPI repository.
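
For DEX-3997, one way to apply the workaround is to check a job's Spark settings before the job is submitted. The following is a minimal sketch, assuming the settings are available as a plain Python dictionary. The helper names uses_driver_python and drop_driver_python and the sample values are hypothetical and are not part of the CDE CLI or API.

```python
# Hypothetical pre-submission check for the DEX-3997 workaround.
# Assumption: the job's Spark settings are held in a plain dict.

DRIVER_PYTHON_KEY = "spark.pyspark.driver.python"


def uses_driver_python(conf):
    """Return True if the configuration sets spark.pyspark.driver.python."""
    return DRIVER_PYTHON_KEY in conf


def drop_driver_python(conf):
    """Return a copy of conf without spark.pyspark.driver.python.

    The input dictionary is left unchanged.
    """
    return {key: value for key, value in conf.items() if key != DRIVER_PYTHON_KEY}


# Illustrative settings only; the values are made up for this example.
job_conf = {
    "spark.pyspark.driver.python": "/usr/bin/python3",
    "spark.executor.memory": "2g",
}

if uses_driver_python(job_conf):
    job_conf = drop_driver_python(job_conf)
```

After this runs, job_conf no longer contains the spark.pyspark.driver.python key, and all other settings are passed through untouched.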