Fixed Issues in Spark
Review the list of Spark issues that are resolved in Cloudera Runtime 7.1.4.
- CDPD-12541: If an insert statement specifies partitions both statically and dynamically, there is a risk of data loss. To prevent data loss, Spark throws an exception.
- This issue is now resolved. Spark allows you to specify a mix of static and dynamic partitions in an insert statement.
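
To illustrate what a mix of static and dynamic partitions looks like, the sketch below builds such an INSERT statement. In the PARTITION clause, a column given a value is static and a column listed without a value is dynamic. The helper `build_insert` and the table and column names (`sales`, `staging_sales`, `year`, `month`) are hypothetical, and the sketch assumes the source query supplies the dynamic partition column as its last column.

```python
def build_insert(table, source, static_parts, dynamic_parts):
    """Build an INSERT statement that mixes static and dynamic partitions.

    static_parts:  dict of column -> SQL literal (same value for every row)
    dynamic_parts: list of columns whose values come from the source query
    """
    # Static partition columns carry a value; dynamic ones are listed bare.
    # Static columns are placed before dynamic columns.
    spec = [f"{col}={value}" for col, value in static_parts.items()]
    spec += list(dynamic_parts)
    return (f"INSERT INTO {table} PARTITION ({', '.join(spec)}) "
            f"SELECT * FROM {source}")


# Hypothetical tables: 'year' is fixed, 'month' is taken from each row.
stmt = build_insert("sales", "staging_sales", {"year": 2020}, ["month"])
print(stmt)
```

The resulting string could be passed to `spark.sql(...)` in a PySpark session; that call is not part of the sketch, so the code runs without Spark installed.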
- CDPD-14203: Introduced the spark.executor.rpc.bindToAll property to support multihoming. When this property is set to true, the executor binds to 0.0.0.0. The property defaults to false.
- This issue is now resolved.
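
One way to enable the property for a single application is to pass it through the `--conf` option of `spark-submit`. The sketch below only assembles the command line. The helper `bind_to_all_conf` and the application file `app.py` are hypothetical.

```python
def bind_to_all_conf(enabled):
    """Return spark-submit arguments that set spark.executor.rpc.bindToAll."""
    # Spark properties are passed as key=value strings.
    value = "true" if enabled else "false"
    return ["--conf", f"spark.executor.rpc.bindToAll={value}"]


# 'app.py' is a placeholder for the application being submitted.
args = ["spark-submit"] + bind_to_all_conf(True) + ["app.py"]
print(" ".join(args))
```

Because the property defaults to false, omitting it leaves the previous binding behavior unchanged.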
- CDPD-10532: Update log4j to address CVE-2019-17571
- Replaced log4j with an internal version to fix CVE-2019-17571.
- CDPD-10515: Incorrect version of jackson-mapper-asl
- Spark now uses an internal version of jackson-mapper-asl to address CVE-2017-7525.
- CDPD-7882: If an insert statement specifies partitions both statically and dynamically, there is a potential for data loss
- To prevent data loss, this fix throws an exception if partitions are specified both statically and dynamically. You can follow the workarounds provided in the error message.
- CDPD-15773: Spark/Hive interaction causes deadlock.
- In previous versions, applications that shared a Spark session across multiple threads could deadlock when accessing the Hive Metastore (HMS). This issue is now resolved.
- CDPD-14906: Spark reads or writes TIMESTAMP data for values before the start of the Gregorian calendar. This happens when Spark is:
- Using dynamic partition inserts.
- Reading or writing from an ORC table when spark.sql.hive.convertMetastoreOrc=false (the default is true).
- Reading or writing from an ORC table when spark.sql.hive.convertMetastoreOrc=true but spark.sql.orc.impl=hive (the default is native).
- Reading or writing from a Parquet table when spark.sql.hive.convertMetastoreParquet=false (the default is true).
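
The configuration-based conditions above can be checked with a small helper. This is a minimal sketch, not part of Spark: the function `matches_listed_condition` is hypothetical, it takes the property values as a plain dictionary of strings, and it falls back to the defaults stated in the list. It does not cover the dynamic partition insert case, which depends on the statement rather than on configuration.

```python
def matches_listed_condition(table_format, conf):
    """Return True if the settings match one of the ORC/Parquet conditions.

    table_format: "orc" or "parquet"
    conf: dict mapping Spark property names to string values
    """
    def is_true(key, default):
        # Missing properties fall back to the defaults given in the list.
        return conf.get(key, default).lower() == "true"

    if table_format == "orc":
        if not is_true("spark.sql.hive.convertMetastoreOrc", "true"):
            return True
        # Conversion is enabled, so only the ORC implementation matters.
        return conf.get("spark.sql.orc.impl", "native").lower() == "hive"
    if table_format == "parquet":
        return not is_true("spark.sql.hive.convertMetastoreParquet", "true")
    raise ValueError(f"unsupported table format: {table_format}")


# With default settings, neither format matches a listed condition.
print(matches_listed_condition("orc", {}))
print(matches_listed_condition("orc", {"spark.sql.orc.impl": "hive"}))
```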