Fixed issues in Oozie

This section lists the issues that have been fixed since the previous version.

CDPD-7357: Upgrade to Guava 28.1 to avoid CVE-2018-10237
Oozie has been upgraded to use Guava version 28.1 to avoid CVE-2018-10237.
CDPD-7526: Oozie - Upgrade to Jetty 9.4.26 to avoid CVEs
Oozie now uses Jetty 9.4.26, which addresses the following CVEs: CVE‑2017‑7656, CVE‑2017‑7657, CVE‑2017‑7658, CVE‑2018‑12536, CVE‑2017‑9735, and CVE‑2019‑10247.
CDPD-10746: Update log4j to address CVE-2019-17571
CDPD-9761: https://issues.apache.org/jira/browse/OOZIE-3584: Fork-join action issue arises when action parameter is not resolved.
Consider a sub-workflow that runs in independent mode and contains a fork with two or more actions. The forked actions run in parallel, with a few seconds of delay between them. If a parameter passed to one of these actions cannot be resolved, that action's status changes to FAILED, and the workflow state also changes to FAILED. Any actions that have not yet started remain stuck in the PREP state indefinitely. The correct behavior is to KILL the remaining actions as well as the workflow. This issue occurs only when the workflow runs in independent mode. If the workflow has a parent workflow, the parent kills it after 10 minutes through the callback process.
CDPD-9721: Upgrade built-in spark-hive in Oozie
Oozie now uses the spark-hive library from the stack.
CDPD-9220: https://issues.apache.org/jira/browse/OOZIE-3586: Oozie spark actions using --keytab fail due to duplicate distributed cache
In CDH 6.x, following the rebase to Hadoop 3, adding an item to the distributed cache more than once is a failure. In CDH 5, it was only a warning. This does not affect most users, because adding an item multiple times is typically a user error, but it completely breaks Spark actions that use keytabs (--keytab). Oozie Spark actions add everything in the launcher job's distributed cache to the Spark job's distributed cache, so the keytab is already present. The --keytab argument then tries to add it again, which causes the failure.
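The failure mode described above can be illustrated with a minimal Python sketch. It is not Oozie's or Hadoop's implementation: the `DistributedCache` class, its `add` method, the `build_spark_cache` helper, and the file names are hypothetical stand-ins for a cache that rejects duplicate entries. Skipping an entry that is already present is shown as one possible way to avoid the failure, not as the fix that was actually applied.

```python
class DuplicateCacheEntryError(Exception):
    """Raised when the same item is added to the cache twice (hypothetical)."""


class DistributedCache:
    """Hypothetical stand-in for a cache that rejects duplicate entries."""

    def __init__(self):
        self.entries = []

    def add(self, item):
        if item in self.entries:
            raise DuplicateCacheEntryError(item)
        self.entries.append(item)


def build_spark_cache(launcher_entries, keytab=None):
    """Copy the launcher's entries, then add the keytab only if it is missing."""
    cache = DistributedCache()
    for item in launcher_entries:
        cache.add(item)
    # Skipping an already-present keytab avoids the duplicate-entry failure.
    if keytab is not None and keytab not in cache.entries:
        cache.add(keytab)
    return cache


launcher = ["app.jar", "user.keytab"]  # hypothetical launcher cache contents
spark_cache = build_spark_cache(launcher, keytab="user.keytab")
print(spark_cache.entries)
```

Calling `spark_cache.add("user.keytab")` directly at this point would raise `DuplicateCacheEntryError`, which mirrors the behavior the description attributes to the --keytab argument.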
CDPD-7108: https://issues.apache.org/jira/browse/OOZIE-3561: Forkjoin validation gets slow when there are more actions in chain.
If a workflow contains many actions in a chain, for example, 80 actions one after the other, the validator code never completes.
CDPD-7107: https://issues.apache.org/jira/browse/OOZIE-3551: Configure working defaults for Spark action in Oozie.
The following options are added to the spark-opts section of the Spark action:
  • --conf spark.yarn.security.tokens.hiveserver2.enabled=false
  • --conf spark.yarn.security.tokens.hivestreaming.enabled=false
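As a rough illustration of what the resulting option string looks like, the following Python sketch appends the two settings above to a user-supplied option string. The function name `with_default_spark_opts` and the sample user options are hypothetical, and the rule that a setting the user already supplied is left untouched is an assumption, not documented Oozie behavior.

```python
import shlex

# The two settings listed above.
DEFAULT_CONFS = [
    "spark.yarn.security.tokens.hiveserver2.enabled=false",
    "spark.yarn.security.tokens.hivestreaming.enabled=false",
]


def with_default_spark_opts(user_opts):
    """Return user_opts with the default --conf settings appended.

    Assumption: a key the user already set via --conf is not overridden.
    """
    tokens = shlex.split(user_opts)
    # Collect the keys of every "--conf key=value" pair already present.
    present = {
        tokens[i + 1].split("=", 1)[0]
        for i, tok in enumerate(tokens[:-1])
        if tok == "--conf"
    }
    for conf in DEFAULT_CONFS:
        key = conf.split("=", 1)[0]
        if key not in present:
            tokens += ["--conf", conf]
    return shlex.join(tokens)


opts = with_default_spark_opts("--executor-memory 2G")
print(opts)
```

In a real workflow, a string of this shape would sit inside the Spark action's spark-opts element. The sketch only shows how the pieces combine.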
CDPD-7106: https://issues.apache.org/jira/browse/OOZIE-2828: Query tag is not functional for Hive2 action node in Oozie.
The query tag is not functional for the Hive2 action node in Oozie. For example, a workflow that is intended to create a Hive table using a Hive2 action node runs successfully, but the table is not created.
CDPD-7105: Oozie workflow processing becomes slow after the increase of rows in WF_JOBS and WF_ACTIONS tables.
When running against SQL Server, Oozie workflow processing becomes slow as the number of rows in the WF_JOBS and WF_ACTIONS tables increases.
CDPD-6877: https://issues.apache.org/jira/browse/OOZIE-3578: MapReduce counters cannot be used over 120
When a MapReduce action creates more than 120 counters, an exception is displayed.
CDPD-6630: https://issues.apache.org/jira/browse/OOZIE-3575: Add credential support for cloud file systems
By default, Oozie gathers delegation tokens for the nodes defined in mapreduce.job.hdfs-servers or oozie.launcher.mapreduce.job.hdfs-servers (in the case of distcp actions) and for the workflow path.
Although this works for HDFS, it does not cover cases where job-related resources that must be accessed at runtime reside on different file systems, buckets, and so on.
The fix addresses the following scenarios, in which Oozie should obtain a delegation token:
  • The defaultFS is cloud.
  • The workflow.xml is in cloud.
  • Input/output/auxiliary files referred from workflow are in cloud.
  • Newly introduced feature: you can define file system credentials for the workflow (as is done with Hive or HCAT). This lets you handle cases where Oozie cannot determine by default which tokens are needed at launch time, and it also lets Oozie obtain tokens for different cloud storage systems and buckets.
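To make the cases above concrete, the following Python sketch collects the distinct file systems referenced by a set of paths, which is the kind of information needed to decide where tokens are required. It is only an illustration: the function name `distinct_filesystems`, the sample URIs, and the bucket and host names are hypothetical, and Oozie's actual logic is not described in this document.

```python
from urllib.parse import urlsplit


def distinct_filesystems(paths, default_fs):
    """Return the sorted scheme://authority roots used by the given paths.

    Paths without a scheme are assumed to live on default_fs.
    """
    roots = set()
    for path in paths:
        parts = urlsplit(path)
        if parts.scheme:
            roots.add(f"{parts.scheme}://{parts.netloc}")
        else:
            roots.add(default_fs)
    return sorted(roots)


paths = [
    "s3a://bucket-a/apps/workflow.xml",  # workflow definition in cloud storage
    "s3a://bucket-b/input/data.csv",     # input file in a different bucket
    "/user/alice/output",                # no scheme: resolved against default_fs
]
filesystems = distinct_filesystems(paths, default_fs="hdfs://namenode:8020")
print(filesystems)
```

Each distinct root in the result corresponds to a file system or bucket for which a token might be needed. When the paths are not known at launch time, explicitly defined file system credentials cover the gap.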
CDPD-5168: https://issues.apache.org/jira/browse/OOZIE-3381: Enhance logging of CoordElFunctions.
Logging in CoordElFunctions has been enhanced for better supportability.
CDPD-4826: Oozie TLS does not work with OpenJDK 11
The Oozie web server does not work when TLS is enabled and OpenJDK 11 is in use.