Workflow Management

Copy Files

All required configuration files for the workflows you intend to create must be installed prior to running the workflow job. For this ETL use case example, you need the MySQL JAR file, and the configuration files for Hive and Tez.

In this example workflow, the MySQL driver JAR file is shared across the cluster rather than specified in each workflow, so you must copy the file to a shared location.

Steps

  1. Identify and copy the name of the timestamped subdirectory of the Oozie /share/lib directory.

    oozie admin -shareliblist
    oozie admin -sharelibupdate

    The output of the -sharelibupdate command shows the lib_$TIMESTAMP directory. You use the timestamp directory name in the following steps.


  2. Copy the mysql-connector*.jar file to the Sqoop lib directory so Sqoop can access MySQL.

    Example:

    hdfs dfs -put /$PATH/mysql-connector-java-5.1.37.jar /user/oozie/share/lib/lib_$TIMESTAMP/sqoop

  3. Copy the configuration files hive-site.xml and tez-site.xml to the Hive and Tez lib directories, renaming them as you copy.

    Example:

    hdfs dfs -put /$PATH/hive-site.xml /user/oozie/share/lib/lib_$TIMESTAMP/hive/hive-conf.xml
    hdfs dfs -put /$PATH/tez-site.xml /user/oozie/share/lib/lib_$TIMESTAMP/tez/tez-conf.xml

    Important

    There can be only one configuration file named $COMPONENT-site.xml in a /lib directory in HDFS for each Apache component. Therefore, you must either rename any copied $COMPONENT-site.xml file or put it in a directory other than a /lib directory.

  4. Update the server to use the newer version of the /share/lib directory.

    oozie admin -sharelibupdate
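
The steps above could be scripted roughly as follows. This is a minimal sketch, not part of the documented procedure. The function names latest_sharelib_dir and copy_workflow_files and the local_dir argument are hypothetical. It assumes the -sharelibupdate output contains the directory name as lib_ followed by digits, with the newest directory mentioned last.

```shell
#!/usr/bin/env bash
# Sketch only: function names and arguments are illustrative assumptions.

# Read command output on stdin and print the last lib_<digits> name found.
# Assumes the newest share/lib directory is the last one mentioned.
latest_sharelib_dir() {
  grep -oE 'lib_[0-9]+' | tail -n 1
}

# Run steps 1-4 for files found in the local directory given as $1.
# Requires the oozie and hdfs clients; nothing runs until this is called.
copy_workflow_files() {
  local local_dir="$1" ts_dir share
  ts_dir=$(oozie admin -sharelibupdate | latest_sharelib_dir)
  if [ -z "$ts_dir" ]; then
    echo "could not find a lib_<timestamp> directory in the output" >&2
    return 1
  fi
  share="/user/oozie/share/lib/$ts_dir"

  # Step 2: make the MySQL driver available to Sqoop.
  hdfs dfs -put "$local_dir"/mysql-connector*.jar "$share/sqoop" || return 1

  # Step 3: copy the configuration files under new names, because a
  # /lib directory may hold only one $COMPONENT-site.xml per component.
  hdfs dfs -put "$local_dir/hive-site.xml" "$share/hive/hive-conf.xml" || return 1
  hdfs dfs -put "$local_dir/tez-site.xml" "$share/tez/tez-conf.xml" || return 1

  # Step 4: tell the server to pick up the updated share/lib contents.
  oozie admin -sharelibupdate
}
```

To run all four steps, call copy_workflow_files with the local directory that holds the JAR and the two configuration files. The timestamp parsing is kept in its own function so that it can be checked against saved command output without a running cluster.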