Fixed Issues in Apache Oozie

Review the list of Oozie issues that are resolved in Cloudera Runtime 7.2.10.

CDPD-20444: The Sqoop build no longer shades the Avro and Parquet libraries, because shading has not been needed for a long time.
Oozie now automatically pulls the necessary Avro and Parquet libraries into Oozie's Sqoop sharelib. If you use Avro or Parquet with Sqoop in an Oozie Sqoop action, you no longer need to copy these libraries to the sharelib manually. This issue is now resolved.
CDPD-21032: Memory leak in EL evaluation.
This issue is now resolved.
CDPD-19684: Oozie Spark action must support automatically copying the hive-site.xml
Oozie now automatically picks up the hive-site.xml and adds it to the YARN container of a Spark action. You no longer need to manually put a hive-site.xml onto Oozie's Spark sharelib. This issue is now resolved.
CDPD-21870: The Oozie client always uses the current system username instead of the one specified by the user, for example, through Kerberos or explicit basic authentication.
A bug in the Oozie CLI caused the Workflow to be launched in the name of the current Unix user even if Kerberos authentication was used with a ticket for a different user. This issue is now resolved.
CDPD-23141: No Sqoop logs present in Oozie Sqoop action launcher logs.
Sqoop logs were not present in the aggregated YARN logs. This issue is now resolved.
CDPD-20002: When stopped, an SSH action should stop the spawned processes on the target host if specified.
Previously, when an SSH action was killed, the child processes launched by the action were not killed. By default, these processes are still not killed, but there are now two ways to terminate them:
  • Use the new 0.3 schema version for your SSH action in your workflow.xml and add the "terminate-subprocesses" XML element with the value "true". Example: <terminate-subprocesses>true</terminate-subprocesses>
  • Set this globally by adding the following oozie-site.xml safety valve in Cloudera Manager with the value "true": "oozie.action.ssh.action.terminate.subprocesses"

If both are set, the value set in the workflow.xml takes precedence. This issue is now resolved.
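
As a rough illustration of the two options above, the fragments below show where each setting might go. The action name, host, command, and transition targets are placeholders. The namespace is assumed to follow the pattern of earlier SSH action schema versions, and the position of the element inside <ssh> is also an assumption, so check the 0.3 schema before relying on it.

```xml
<!-- workflow.xml fragment: SSH action on the 0.3 schema (placeholder names and values) -->
<action name="ssh-example">
    <ssh xmlns="uri:oozie:ssh-action:0.3">
        <host>user@target-host.example.com</host>
        <command>/path/to/long-running-script.sh</command>
        <!-- Assumed placement; the 0.3 XSD defines the required element order -->
        <terminate-subprocesses>true</terminate-subprocesses>
    </ssh>
    <ok to="end"/>
    <error to="fail"/>
</action>

<!-- oozie-site.xml safety valve fragment: global setting; a value in workflow.xml takes precedence -->
<property>
    <name>oozie.action.ssh.action.terminate.subprocesses</name>
    <value>true</value>
</property>
```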

CDPD-20984: Make Oozie backward compatible with the Falcon Oozie EL extension library.
The falcon-oozie-el-extension library written for HDP 2.6 was not compatible with Oozie in CDP 7.x. Oozie was changed so that this library is forward compatible with Oozie in CDP 7. Note: Only falcon-oozie-el-extension is compatible; the rest of the HDP 2.6 Falcon library is still not compatible with CDP. This issue is now resolved.
CDPD-22161: The YARN application should be submitted at the code level as the user running the Workflow, to avoid requesting a delegation token from IdBroker in the name of Oozie.
When the YARN remote log directory was set to S3 or ABFS, log aggregation for Oozie actions did not work by default. The workaround was either to extend the IdBroker mapping and add the oozie user to it, or to add an explicit file system credential (pointing to the YARN remote log directory) to the Oozie Workflow. These workarounds are no longer required, because a delegation token for the remote log directory is now obtained from IdBroker in the name of the user who is running the Workflow. This issue is now resolved.
CDPD-11965: Cookie without HttpOnly and Secure flag set.
The Secure and HttpOnly attributes are now set on all Cookies returned by Oozie as per recommendations. This issue is now resolved.
CDPD-19281: Missing CSP, X-XSS-Protection, HSTS Headers.

Oozie was enhanced with additional HTTP headers to make it more secure. As part of these enhancements, Oozie now returns the following HTTP headers:
  • X-XSS-Protection with value "1; mode=block"
  • Content-Security-Policy with value "default-src 'self'"
  • Strict-Transport-Security with value "max-age=31536000; includeSubDomains"

You can remove these headers by adding an oozie-site.xml safety valve in Cloudera Manager that uses the "oozie.servlets.response.header." prefix and has an empty value, which must be a single space. Example: "oozie.servlets.response.header.Strict-Transport-Security= "

You can also modify the values of these headers in the same way through a safety valve. Example: "oozie.servlets.response.header.Strict-Transport-Security=max-age=604800; includeSubDomains"

Using the same prefix, you can also make Oozie return custom HTTP headers. Example: "oozie.servlets.response.header.MyHeader=MyValue".
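
The three cases above (remove, modify, add) could look roughly like this when entered in XML form in the oozie-site.xml safety valve. This is only a sketch: the choice of which header to remove or modify is arbitrary, MyHeader and MyValue are the illustrative names from the example above, and it assumes the single-space value is preserved exactly as typed.

```xml
<!-- Remove a default header: the value is a single space -->
<property>
    <name>oozie.servlets.response.header.X-XSS-Protection</name>
    <value> </value>
</property>

<!-- Modify a default header: shorten the HSTS max-age to one week (604800 seconds) -->
<property>
    <name>oozie.servlets.response.header.Strict-Transport-Security</name>
    <value>max-age=604800; includeSubDomains</value>
</property>

<!-- Add a custom header -->
<property>
    <name>oozie.servlets.response.header.MyHeader</name>
    <value>MyValue</value>
</property>
```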

These were originally decommissioned when Oozie was rebased from 4.x to 5.x, but to reduce the migration effort for users these are supported again.

This issue is now resolved.

CDPD-19473: CVE-2020-35451 - Fix privilege escalation vulnerability in OozieSharelibCLI.
The security vulnerability was in Oozie's sharelib CLI regarding the temporary directory. This issue is now resolved.
CDPD-20649: Revise yarn.app and mapreduce property overrides in Oozie.
When upgrading from CDH 5 or HDP 2/3, Oozie can still handle the following MapReduce-related properties:
  • oozie.launcher.mapreduce.map.memory.mb
  • oozie.launcher.mapreduce.map.cpu.vcores
  • oozie.launcher.mapreduce.map.java.opts

These were originally decommissioned when Oozie was rebased from 4.x to 5.x, but to reduce the migration effort for users these are supported again. This issue is now resolved.
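
A minimal sketch of how such properties are typically supplied, namely in the <configuration> element of a workflow action. The values shown are placeholders, not recommendations, and the surrounding action definition is omitted.

```xml
<!-- workflow.xml fragment: launcher overrides inside an action's configuration (placeholder values) -->
<configuration>
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>oozie.launcher.mapreduce.map.cpu.vcores</name>
        <value>1</value>
    </property>
    <property>
        <name>oozie.launcher.mapreduce.map.java.opts</name>
        <value>-Xmx1536m</value>
    </property>
</configuration>
```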

CDPD-21031: Workflow and coordinator action status remains RUNNING after rerun.
This issue is now resolved.
CDPD-18931: "No appropriate protocol" error with email action (disable TLS 1.0/1.1).
Oozie security enhancements. This issue is now resolved.
CDPD-17598: Log library issues.
The available logging libraries for Hive 2, Spark, and Sqoop actions are adjusted. This issue is now resolved.
CDPD-18703: The Oozie version returns incorrect values.
The "oozie version" command now returns the correct Oozie version and build time. This issue is now resolved.
CDPD-17306: Hive-Common is added as a dependency to Sqoop and Oozie's Sqoop sharelib so that you do not have to do it manually.
This issue is now resolved.
CDPD-17843: Hive-JDBC is added as a dependency to Sqoop and Oozie's Sqoop sharelib so you do not have to do it manually.
This issue is now resolved.
OPSAPS-58298: Oozie must accept a keyStoreType and trustStoreType property in oozie-site.xml.
This issue is now resolved.
CDPD-14964: When a Java action called System.exit(), a misleading security exception was raised. For Sqoop actions, this exception appeared as an error in the YARN logs even though the Workflow was successful.
This issue is now resolved.
CDPD-15735: Oozie Spark actions are failing because Spark and Kafka are using different Scala versions.
This issue is now resolved.
OPSAPS-57429: ZooKeeper SSL/TLS support for Oozie.
When SSL is enabled in ZooKeeper, Oozie tries to connect to ZooKeeper using SSL instead of a non-secure connection.
CDPD-14600
Apache ActiveMQ is updated to address CVE-2016-3088
CDPD-13702
The PostgreSQL driver is upgraded to address CVE-2020-13692
CDPD-11967
Fix to address CWE-693: Protection Mechanism Failure
CDPD-12742: Oozie was not able to communicate with IdBroker, and therefore failed to obtain a delegation token, because of a missing JAR
That JAR is now deployed together with Oozie, so the underlying issue is fixed.
CDPD-12283: By default, Oozie did not allow the use of the s3a and abfs file systems, and users had to manually enable support for them through a safety valve
Because Oozie is compatible with these file systems, the default Oozie configuration now allows them, so users do not have to enable them manually.
CDPD-10746: Fix to address CVE-2019-17571
CDPD-9895: Various errors when trying to use an S3 filesystem
Oozie is now fully compatible with S3.
CDPD-9761: A sub-workflow that runs in independent mode runs a fork action that contains two or more actions
The actions inside the fork run in parallel, with a delay of a few seconds between them. If a parameter that cannot be resolved is passed to one of these actions, the status of that action changes to FAILED, and the state of the workflow also changes to FAILED. The remaining actions that have not started yet are stuck in PREP state forever. The correct behaviour is to stop the remaining actions as well as the workflow. Note: This bug only occurs when the sub-workflow runs in independent mode. If it has a parent workflow, the parent workflow stops this workflow after 10 minutes because of the callback process.
CDPD-9721: Upgrade built-in spark-hive in Oozie
Oozie is using the Spark-Hive library from the stack.
CDPD-9220: Oozie Spark actions using --keytab fail due to a duplicate entry in the distributed cache
Oozie Spark actions add everything in the distributed cache of the launcher job to the distributed cache of the Spark job, so the keytab is already there. The --keytab argument then tries to add it again, which causes the failure.
CDPD-9189: Apache Pig support was completely removed from Oozie
CDPD-7108: When a workflow has a long chain of consecutive actions (for example, 80), the validator code effectively never finishes
CDPD-7107: The following were added to the spark opts section of the spark action: --conf spark
CDPD-7106: The query tag is not functional for the Hive2 action node in Oozie
The workflow is intended to create a Hive table using a Hive2 action node. Although the workflow runs successfully, the table is not created.
CDPD-7105: Oozie workflow processing becomes slow after the increase of rows in WF_JOBS and WF_ACTIONS tables when running against SQL Server
CDPD-6877: When a MapReduce action created more than 120 counters, an exception was thrown
CDPD-6630: Oozie by default gathers delegation tokens for the nodes defined in MapReduce
CDPD-5168: Logging enhancements in CoordElFunctions for better supportability
CDPD-4826: Oozie's web server does not work when TLS is enabled and OpenJDK 11 is in use
This issue is now fixed.

Apache patch information

  • OOZIE-3365
  • OOZIE-3596
  • OOZIE-3409