Apache Sqoop Changes
After upgrading from CDH to CDP, Sqoop action errors are not logged because of a change in the log4j configuration. You must configure Oozie to log Sqoop action errors in the Oozie launcher log.
Before Upgrade to CDP
The Sqoop action in an Oozie workflow logs errors to the Oozie launcher log.
For example:
>>> Invoking Sqoop command line now >>>
2021-01-11 09:58:21,438 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2021-01-11 09:58:21,489 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.7.7.1.5.0-257
2021-01-11 09:58:21,503 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
2021-01-11 09:58:21,516 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
...
After Upgrade to CDP
The Sqoop action in an Oozie workflow does not log errors to the Oozie launcher log.
For example:
>>> Invoking Sqoop command line now >>>
09:39:49.715 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jar is deprecated. Instead, use mapreduce.job.jar
09:39:49.738 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
09:39:49.974 [main] INFO org.apache.hadoop.mapreduce.JobResourceUploader - Disabling Erasure Coding for path: /user/cloudera/.staging/job_1609912545960_0013
09:39:50.347 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area /user/cloudera/.staging/job_1609912545960_0013
<<< Invocation of Sqoop command completed <<<
No child hadoop job is executed.
Intercepting System.exit(1)
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
Action Required
Configure the affected Sqoop action to use the log4j1 configuration. Limit the change to that Sqoop action and its Hue workspace only.
Check Parquet writer implementation property
You might need to set the Parquet writer implementation property to hadoop. In releases before CDP 7.1.6/Cloudera Manager 7.3.1, the default value of the Parquet writer implementation property (parquetjob.configurator.implementation) was not hadoop.
Setting the property to hadoop prevents the following error, which the Sqoop client can report after the upgrade:
Invalid Parquet job configurator implementation is set: kite. Supported values are: [HADOOP]
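As an illustration, the property could be set with a standard Hadoop-style configuration entry such as the one below. This is a minimal sketch, not the documented procedure: the property name and value come from the text above, but placing the entry in sqoop-site.xml is an assumption. On a cluster managed by Cloudera Manager, the equivalent setting is normally applied through the Sqoop client configuration in Cloudera Manager rather than by editing the file directly.

```xml
<!-- Assumed location: the Sqoop client's sqoop-site.xml (an assumption, see above) -->
<property>
  <!-- Parquet writer implementation property named in this section -->
  <name>parquetjob.configurator.implementation</name>
  <!-- "kite" is rejected after the upgrade; "hadoop" is the supported value -->
  <value>hadoop</value>
</property>
```

After the change, a Sqoop client that previously failed with the "Invalid Parquet job configurator implementation is set: kite" error should no longer report it.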