Workflow Management

Create the Sqoop Action to Load the Data


  1. In the workflow graph, click the connector between the Start and End nodes, then click the + icon.

  2. Click the Sqoop icon to add another Sqoop action node to the workflow.

    This Sqoop action will be used to load the transformed data to a specified location.

  3. Click the Sqoop node in the workflow graph and rename it using a descriptive name.

    For example, name the node sqoop-load.

    This is necessary because there will be two Sqoop actions in this workflow, and each node in a workflow must have a unique name. Having descriptive node names is also helpful when identifying what a node is intended to do, especially in more complicated workflows.

  4. Click the Sqoop node again and then click the Action Settings gear icon.

  5. In the Sqoop action dialog box, select Command.

  6. In the Command field, enter a command to load the data.

    For example:

    export --connect jdbc:mysql://wfmgr-5.openstacklocal/customer-data --username wfm --password-file /user/wfm/.password
    --table exported --input-fields-terminated-by "\001" --export-dir /usr/output/marketing/customer_id

    The password for user wfm is read from a password file rather than entered in plain text.

  7. In the Advanced Properties section, browse to the directory that contains the Hive and Tez configuration files you copied into a lib directory and add those resources to the File fields.

  8. In the Prepare section, select delete, and then browse for or type the path to be deleted.

    Selecting delete ensures that if a job is interrupted before completion, any files it created are deleted before the job is re-executed; otherwise, the rerun cannot complete.

    You can optionally include the delete option in the Command field.

  9. Use the default settings for the remaining fields and options.

  10. Click Save and close the dialog box.
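Behind the designer UI, Workflow Manager saves each action as Oozie workflow XML. The steps above produce a Sqoop action node roughly like the following sketch. The workflow name, the lib directory paths, the delete path, and the transition targets (`end`, `kill`) are illustrative assumptions, not values taken from this tutorial:

```xml
<!-- Sketch only: names and paths below are assumptions for illustration. -->
<workflow-app name="customer-etl" xmlns="uri:oozie:workflow:0.5">
  <start to="sqoop-load"/>

  <action name="sqoop-load">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Step 8: delete leftover files so a rerun can complete. -->
      <prepare>
        <delete path="${nameNode}/tmp/sqoop-load-work"/>
      </prepare>
      <!-- Step 6: the load command entered in the Command field. -->
      <command>export --connect jdbc:mysql://wfmgr-5.openstacklocal/customer-data --username wfm --password-file /user/wfm/.password --table exported --input-fields-terminated-by "\001" --export-dir /usr/output/marketing/customer_id</command>
      <!-- Step 7: Hive and Tez configuration files added as File resources. -->
      <file>/user/wfm/lib/hive-site.xml</file>
      <file>/user/wfm/lib/tez-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="kill"/>
  </action>

  <kill name="kill">
    <message>Sqoop load failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Because each action becomes a named XML element, you can see why step 3 requires unique node names: two actions named sqoop would produce an invalid workflow definition.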

More Information

Apache Sqoop Action

Apache Sqoop User Guide