Fixed Issues in Apache Oozie
Review the list of Oozie issues that are resolved in Cloudera Runtime 7.1.9.
- CDPD-27164: Oozie should not rely on its LoadBalancer internally
- Oozie no longer uses the LoadBalancer to issue callback notifications. Instead, it tries each available Oozie instance one by one and stops as soon as the callback succeeds against one of them.
- CDPD-58538: Oozie should upload and use the config files from sqoop-conf/managers.d when available
- Previously, Oozie did not honor Sqoop's managers.d configurations or the extra connector Jars from the lib folder. Both are now automatically available in Oozie's Sqoop action, so users can use connectors such as the Sqoop Teradata connector without manually updating the configuration or copying Jars to the Workflow's lib folder.
- CDPD-50296: Improve Oozie's app state action checking
- Oozie's action state checking was enhanced to query for running applications immediately after start-up.
- CDPD-41425: LAST_ONLY and NONE execution modes
- Possible OutOfMemoryError when there are too many coordinator actions to materialize.
- If a coordinator job is defined with a by-the-minute frequency (for example, frequency="* * * * *"), its start-time lies well in the past, and its execution mode is LAST_ONLY or NONE, too many CoordinatorActionBean instances can be kept on the JVM heap within CoordMaterializeTransitionXCommand#insertList, because those execution modes omit the check for the throttle value.
- As a consequence, the Oozie log can contain several hundred thousand related entries:
[user@host ~]$ grep 'In storeToDB() coord action id' /var/log/oozie/oozie-HOSTNAME.log.out | wc -l
478408
Apache Jira: https://issues.apache.org/jira/browse/OOZIE-3254
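The memory effect can be illustrated with a small sketch. It is not Oozie's code: `materialize` and its `throttle` parameter are hypothetical stand-ins, and the throttle is simplified here to a plain cap on the number of pending actions held in memory.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def materialize(start: datetime, end: datetime, step: timedelta,
                throttle: Optional[int]) -> List[datetime]:
    """Collect the nominal times of actions between start and end.

    Hypothetical model: throttle=None mimics a code path that skips the
    throttle check, so every action since start is held in memory at once.
    """
    pending: List[datetime] = []
    current = start
    while current < end:
        if throttle is not None and len(pending) >= throttle:
            break  # cap reached: leave the rest for a later materialization run
        pending.append(current)
        current += step
    return pending


start = datetime(2023, 1, 1)
end = start + timedelta(days=30)       # start-time 30 days in the past
every_minute = timedelta(minutes=1)    # frequency="* * * * *"

print(len(materialize(start, end, every_minute, throttle=None)))  # 43200
print(len(materialize(start, end, every_minute, throttle=12)))    # 12
```

With a one-minute frequency, even a 30-day backlog yields tens of thousands of in-memory objects when no cap applies, which is why a missing throttle check can exhaust the heap.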
- CDPD-43192: Extend Oozie Spark sharelib for HBase interaction
- Additional HBase Jars are added to the Spark sharelib to support proper HBase interaction.
- CDPD-43343: Oozie log streaming bug when log timestamps are the same on multiple Oozie servers
- Fixed a bug in the mechanism of the Oozie log streaming.
- If a log message on server "A" has the same timestamp as another log message on server "B", the `TimestampedMessageParser` corresponding to server "B" is overwritten by server "A"'s parser (because of the `timestampMap.put(earliestParser.getLastTimestamp(), earliestParser)` operation). As a result, the log messages from server "B" are ignored from that point on.
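The failure mode described above can be reproduced in a few lines. This is a simplified model, not Oozie's implementation: the function names and sample messages are hypothetical, and the second function shows only one generic way to keep tied entries (a stable merge), which is not necessarily how Oozie resolved the bug.

```python
import heapq
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (timestamp, message)


def merge_keyed_by_timestamp(streams: Dict[str, List[Entry]]) -> List[str]:
    """Buggy merge: a map keyed by timestamp holds one stream per timestamp."""
    iters = {name: iter(entries) for name, entries in streams.items()}
    by_ts: Dict[int, Tuple[str, str]] = {}
    for name, it in iters.items():
        first = next(it, None)
        if first is not None:
            by_ts[first[0]] = (name, first[1])
    merged: List[str] = []
    while by_ts:
        name, message = by_ts.pop(min(by_ts))
        merged.append(message)
        nxt = next(iters[name], None)
        if nxt is not None:
            # On an equal timestamp this silently replaces the other stream.
            by_ts[nxt[0]] = (name, nxt[1])
    return merged


def merge_keeping_ties(streams: Dict[str, List[Entry]]) -> List[str]:
    """Stable merge on the timestamp only, so tied entries are all kept."""
    ordered = heapq.merge(*streams.values(), key=lambda entry: entry[0])
    return [message for _, message in ordered]


logs = {
    "A": [(2, "A2"), (3, "A3"), (5, "A5")],
    "B": [(1, "B1"), (3, "B3"), (4, "B4")],
}
print(merge_keyed_by_timestamp(logs))  # ['B1', 'A2', 'A3', 'A5']
print(merge_keeping_ties(logs))        # ['B1', 'A2', 'A3', 'B3', 'B4', 'A5']
```

In the first result, server "B" disappears after the shared timestamp 3, which matches the symptom described for this issue.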
- CDPD-44209: SqoopMain's printArgs masks Sqoop command line option if preceding one contains "password"
- Previously, Oozie could mask command-line arguments incorrectly in the YARN logs because of mistaken password detection. You can now use the "oozie.launcher.argumentMaskingExceptionList" configuration to specify exceptions to password masking. For details on how to use this configuration, see the documentation in oozie-default.xml.
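The kind of false positive involved can be sketched as follows. This is an illustration under assumptions, not SqoopMain's actual logic: `mask_args`, the `********` placeholder, the matching rule for the exception list, and the `--use-password-store` flag are all hypothetical. Consult oozie-default.xml for the real format of the exception list.

```python
from typing import Collection, List

MASK = "********"  # placeholder chosen for this sketch


def mask_args(args: List[str], exceptions: Collection[str] = ()) -> List[str]:
    """Mask the argument that follows any option containing "password".

    Options listed in `exceptions` do not trigger masking (assumed
    exact-match semantics, for illustration only).
    """
    masked: List[str] = []
    mask_next = False
    for arg in args:
        if mask_next:
            masked.append(MASK)
            mask_next = False
            continue
        masked.append(arg)
        if "password" in arg.lower() and arg not in exceptions:
            mask_next = True
    return masked


# Hypothetical value-less flag whose name merely contains "password".
command = ["import", "--use-password-store", "--table", "orders"]

print(mask_args(command))
# ['import', '--use-password-store', '********', 'orders']
print(mask_args(command, exceptions={"--use-password-store"}))
# ['import', '--use-password-store', '--table', 'orders']
```

Without the exception, the unrelated `--table` option is hidden from the printed arguments. With it, only genuine password values are masked.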
- CDPD-46049: SSH action fails when 'oozie.action.ssh.http.command.post.options' property contains double quotes
- The SSH action's callback mechanism failed with an "Invalid content-type" error when capture-output was used in the action definition.
- CDPD-47821: Add missing Sqoop Atlas notification jars to Sqoop share lib
- Previously, Atlas notifications did not work in Oozie's Sqoop action because of missing Jars. Those Jars are now included in Oozie's Sqoop ShareLib, so Atlas notifications are expected to function correctly in the Sqoop action.
- CDPD-56936: Oozie's db cli tool does not honor custom connection properties
- The Oozie DB CLI tool did not respect the "ConnectionProperties" property set by the user through the "oozie.service.JPAService.connection.properties" configuration in Oozie.
- OPSAPS-64457: Make CM provide Oozie the necessary configuration regarding CDPD-43396
- HBase service and Sqoop client dependencies were added so that Oozie has access to their configurations.
- OPSAPS-63816: Configure service hosts to Oozie
- Cloudera Manager now provides the addresses of all Oozie server instances as a configuration to every Oozie instance. Oozie's callback mechanism uses this list in HA mode: instead of making the callback through the LoadBalancer, it attempts the callback against each Oozie instance and stops as soon as one succeeds. The LoadBalancer is no longer used for this purpose, which makes the callback mechanism safer by removing the intermediary.
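The try-each-instance behavior described above can be sketched as follows. This is a minimal illustration, not Oozie's implementation: `notify_first_available`, the `send` callable, and the host names are hypothetical.

```python
from typing import Callable, Iterable, List, Optional


def notify_first_available(instances: Iterable[str],
                           send: Callable[[str], bool]) -> Optional[str]:
    """Try each instance in turn; return the first one that accepts the callback."""
    for base_url in instances:
        try:
            if send(base_url):
                return base_url  # success: do not try the remaining instances
        except OSError:
            continue  # instance unreachable: fall through to the next one
    return None  # no instance accepted the callback


# Hypothetical server list, standing in for the addresses supplied as configuration.
servers = [
    "http://oozie-1.example.com:11000/oozie",
    "http://oozie-2.example.com:11000/oozie",
    "http://oozie-3.example.com:11000/oozie",
]
attempted: List[str] = []


def fake_send(base_url: str) -> bool:
    """Stand-in for the HTTP callback: the first server is down."""
    attempted.append(base_url)
    if "oozie-1" in base_url:
        raise ConnectionError("instance down")
    return True


print(notify_first_available(servers, fake_send))
print(attempted)  # only the first two servers were contacted
```

Because the loop returns on the first success, a healthy instance later in the list is never contacted unnecessarily.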
- OPSAPS-67346: [oozie] Implement validator in CM for Oozie-Spark3 integration
- A validator was added that checks for a Spark3 role on every Oozie node. If a Spark3 role is missing on any node, a warning message listing the affected nodes is shown on Oozie's CM page.
Apache patch information