Copy Files
Install all configuration files required by the workflows you intend to create before you run the workflow job. For this ETL use case example, you need the MySQL JAR file and the configuration files for Hive and Tez.
This example workflow shares the MySQL driver JAR file across the cluster rather than identifying the file in each workflow, so you must copy the file to a shared location.
Steps
Identify and copy the name of the timestamped subdirectory of the Oozie /share/lib directory.
oozie admin -shareliblist
oozie admin -sharelibupdate
The output of the -sharelibupdate command shows the lib_$TIMESTAMP directory. You use the timestamp directory name in the following steps.
Copy the mysql-connector*.jar file to the Sqoop lib directory so Sqoop can access MySQL.
Example:
hdfs dfs -put /$PATH/mysql-connector-java-5.1.37.jar /user/oozie/share/lib/lib_$TIMESTAMP/sqoop
Copy the configuration files hive-site.xml and tez-site.xml to the Hive and Tez lib directories, renaming them as you copy.
Example:
hdfs dfs -put /$PATH/hive-site.xml /user/oozie/share/lib/lib_$TIMESTAMP/hive/hive-conf.xml
hdfs dfs -put /$PATH/tez-site.xml /user/oozie/share/lib/lib_$TIMESTAMP/tez/tez-conf.xml
Important: There can be only one configuration file named $COMPONENT-site.xml in a /lib directory in HDFS for each Apache component. Therefore, you must either rename any copied $COMPONENT-site.xml file or put it in a directory other than a /lib directory.
Update the server to use the newer version of the /share/lib directory.
oozie admin -sharelibupdate
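The steps above can be sketched as one shell script. This is a minimal sketch, not part of the documented procedure. The function names latest_sharelib_dir, run, and copy_etl_files and the DRY_RUN variable are hypothetical. The script assumes that the sharelib root is /user/oozie/share/lib, as in the examples, and that timestamped directories are named lib_ followed by digits. By default it only prints the commands so that you can review them before running anything.

```shell
#!/bin/sh
# Sketch only: prints the copy commands from the steps above for review.
# Set DRY_RUN=0 in the environment to execute them instead of printing.

SHARELIB_ROOT=/user/oozie/share/lib   # path used in the examples above
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: read `hdfs dfs -ls` style output on stdin and print
# the newest lib_<digits> directory name (assumes fixed-width timestamps,
# so a plain text sort puts the newest name last).
latest_sharelib_dir() {
  grep -o 'lib_[0-9][0-9]*$' | sort | tail -n 1
}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: copy the JAR and the renamed configuration files,
# then refresh the sharelib.
#   $1 = local directory that holds the files
#   $2 = lib_<timestamp> directory name
copy_etl_files() {
  etl_src=$1
  etl_dest="$SHARELIB_ROOT/$2"
  run hdfs dfs -put "$etl_src/mysql-connector-java-5.1.37.jar" "$etl_dest/sqoop"
  # The files are renamed on copy because only one $COMPONENT-site.xml
  # is allowed per component in a /lib directory.
  run hdfs dfs -put "$etl_src/hive-site.xml" "$etl_dest/hive/hive-conf.xml"
  run hdfs dfs -put "$etl_src/tez-site.xml" "$etl_dest/tez/tez-conf.xml"
  run oozie admin -sharelibupdate
}

# Example use on a cluster node (not executed here):
#   TIMESTAMP_DIR=$(hdfs dfs -ls "$SHARELIB_ROOT" | latest_sharelib_dir)
#   copy_etl_files /$PATH "$TIMESTAMP_DIR"
```

Printing the commands first makes it easy to confirm the timestamp directory and the renamed file targets before anything is written to HDFS.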