Workflow Management
Chapter 6. Sample ETL Use Case

You can use Sqoop and Hive actions in a workflow to perform a common ETL flow: extract data from a relational database using Sqoop, transform the data in a Hive table, and load the data into a data warehouse using Sqoop.

The example in this chapter is a simple Sqoop>Hive>Sqoop ETL workflow created in Workflow Manager. It extracts customer data from a MySQL database, selects specific data to include in a Hive table, and then loads that data into a data warehouse.


To run the example Sqoop>Hive>Sqoop ETL workflow defined below, the following prerequisites must be met:

  • Apache Hive and Apache Sqoop have been successfully installed and configured.

  • You have completed the tasks in "Configuring WorkFlow Manager View" in the Ambari Views guide.

  • All NodeManager hosts can communicate with the MySQL server.
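
One way to spot-check the last prerequisite is to test, from each NodeManager host, whether a TCP connection to the MySQL server can be opened. The following is a minimal sketch and not part of Workflow Manager: the function name `can_reach` is hypothetical, and the port assumes the MySQL default of 3306, which may differ in your environment. A successful connection shows only that the port is reachable; it does not verify database credentials or grants.

```python
import socket

# Assumption: the MySQL server listens on the default port 3306.
MYSQL_DEFAULT_PORT = 3306


def can_reach(host: str, port: int = MYSQL_DEFAULT_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This checks network reachability only; it does not log in to MySQL.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running something like `can_reach("mysql.example.com")` on every NodeManager host (the host name here is a placeholder) gives a quick indication of whether Sqoop tasks scheduled on that host would be able to open a connection to the database.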

Workflow Tasks

The sample workflow consists of the following: