Copy Files
All required configuration files for the workflows you intend to create must be installed prior to running the workflow job. For this ETL use case example, you need the MySQL JAR file, and the configuration files for Hive and Tez.
Prerequisites
Ensure that you have the latest supported MySQL version for your environment, according to the Support Matrices.
In this example workflow, the MySQL driver JAR file is shared across the cluster rather than identified in each workflow, so the file must be copied to a shared location.
Steps
Log in to the HDFS server as the Oozie user. For example:
su - oozie
Identify and copy the name of the timestamped subdirectory of the Oozie /share/lib directory.
oozie admin -shareliblist
oozie admin -sharelibupdate
The output of the -sharelibupdate command shows the lib_$TIMESTAMP directory. You use the timestamp directory name in the following steps.
Copy the mysql-connector*.jar file to the Sqoop lib directory so Sqoop can access MySQL. Example:
hdfs dfs -put /$PATH/mysql-connector-java-5.1.37.jar /user/oozie/share/lib/lib_$TIMESTAMP/sqoop
Check the Support Matrices for the latest supported MySQL version for your environment.
Copy the configuration files hive-site.xml and tez-site.xml to the Hive and Tez lib directories and rename them. Example:
hdfs dfs -put /$PATH/hive-site.xml /user/oozie/share/lib/lib_$TIMESTAMP/hive/hive-conf.xml
hdfs dfs -put /$PATH/tez-site.xml /user/oozie/share/lib/lib_$TIMESTAMP/tez/tez-conf.xml
Important: There can be only one configuration file named $COMPONENT-site.xml in a /lib directory in HDFS for each Apache component. Therefore, you must either rename any copied $COMPONENT-site.xml file or put it in a directory other than a /lib directory.
Update the server to use the newer version of the /share/lib directory.
oozie admin -sharelibupdate
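The steps above can be sketched as a small dry-run script. It only prints the commands and does not contact HDFS or Oozie, so you can review them before running anything. The function names, the /tmp/staging source directory, and the sample command output are hypothetical and used for illustration only; the real -sharelibupdate output on your cluster may be laid out differently. The sketch also assumes that tez-site.xml is the source file for tez-conf.xml, as the wording of the copy step suggests.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the copy commands instead of executing them.

# Read text on stdin (for example, output of `oozie admin -sharelibupdate`)
# and print the first lib_<timestamp> name found in it.
sharelib_dir_name() {
  grep -o 'lib_[0-9][0-9]*' | head -n 1
}

# Print the copy commands for a local source directory ($1)
# and a timestamped sharelib directory name ($2).
print_copy_commands() {
  local src="$1" libdir="$2"
  local base="/user/oozie/share/lib/${libdir}"
  # The MySQL driver goes to the Sqoop lib directory.
  echo "hdfs dfs -put ${src}/mysql-connector-java-5.1.37.jar ${base}/sqoop"
  # The site files are renamed on copy, because only one $COMPONENT-site.xml
  # file may exist per component in a /lib directory.
  echo "hdfs dfs -put ${src}/hive-site.xml ${base}/hive/hive-conf.xml"
  # Assumption: tez-site.xml is the source file for tez-conf.xml.
  echo "hdfs dfs -put ${src}/tez-site.xml ${base}/tez/tez-conf.xml"
  # Make the Oozie server pick up the changed share lib.
  echo "oozie admin -sharelibupdate"
}

# Illustrative sample only; the host name and layout are assumptions.
sample_output='[ShareLib update status]
    sharelibDirNew = hdfs://namenode.example.com:8020/user/oozie/share/lib/lib_20150908210859'

libdir="$(printf '%s\n' "$sample_output" | sharelib_dir_name)"
print_copy_commands "/tmp/staging" "$libdir"
```

On a real cluster you would pipe the actual command output into sharelib_dir_name instead of the sample text, then run the printed commands as the Oozie user once the paths look right.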
More Information