Sqoop Action Parameters
You can use the Apache Sqoop action to move structured data between Apache Hadoop and relational databases. You can import data from a relational database into files at a specified location in your Hadoop cluster, and you can extract data from Hadoop and export it to relational databases outside of Hadoop. Sqoop works with several databases, including Teradata, Netezza, Oracle, MySQL, and PostgreSQL.
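The parameters described in the following tables correspond to elements of the Sqoop action node in the workflow definition that Workflow Manager generates. As a point of reference, a minimal workflow containing a Sqoop action might look like the following sketch; the workflow and node names are illustrative assumptions, and the command is the import example from Table 7.9:

```xml
<!-- Minimal sketch of a workflow with a Sqoop action; names and paths are illustrative. -->
<workflow-app name="sqoop-import-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="sqoop-node"/>
    <action name="sqoop-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- The Sqoop invocation, without the leading "sqoop" keyword -->
            <command>import --connect jdbc:mysql://wfm.openstacklocal/test --username centos --password-file /user/centos/.password --table wfm --split-by rowkey --hive-import -m 1</command>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Sqoop action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```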
Table 7.9. Sqoop Action, General Parameters
Parameter Name | Description | Additional Information | Example |
---|---|---|---|
Send As | Options are Command and Args (arguments). | Determines whether the invocation is rendered as a single command element or as a series of arg elements; see the sketch following this table. | |
Command | You can enter a valid Sqoop command. | See the Apache Sqoop documentation for available commands. | import --connect jdbc:mysql://wfm.openstacklocal/test --username centos --password-file /user/centos/.password --table wfm --split-by rowkey --hive-import -m 1 |
Args | You can enter one or more valid Sqoop arguments. | See the Apache Sqoop documentation for available arguments. | --connect jdbc:mysql://wfm.openstacklocal/test --username |
Job XML | You can select one or more job.xml files to pass Sqoop configuration details. | The configuration file specifies the variables used for the Sqoop action in the workflow. Entries can be overridden or replaced by entries in the Configuration section. | |
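The Send As choice determines how the invocation is rendered in the action definition: Command produces a single command element, while Args produces one arg element per argument. The following sketch shows both forms, along with a job-xml reference; the job.xml path is an illustrative assumption:

```xml
<!-- "Send As: Command" renders the entire invocation in one element -->
<command>import --connect jdbc:mysql://wfm.openstacklocal/test --username centos --password-file /user/centos/.password --table wfm --split-by rowkey --hive-import -m 1</command>

<!-- "Send As: Args" renders one element per argument -->
<arg>import</arg>
<arg>--connect</arg>
<arg>jdbc:mysql://wfm.openstacklocal/test</arg>
<arg>--username</arg>
<arg>centos</arg>

<!-- "Job XML" renders one element per selected file (illustrative path) -->
<job-xml>/user/centos/oozie/apps/sqoop/job.xml</job-xml>
```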
Table 7.10. Sqoop Action, Transition Parameters
Parameter Name | Description | Additional Information | Default Setting |
---|---|---|---|
Error To | Indicates what node to transition to if the action errors out. | You can modify this setting in the dialog box or by modifying the workflow graph. | Defaults to the kill node, but can be changed. |
OK To | Indicates what node to transition to if the action succeeds. | You can modify this setting in the dialog box or by modifying the workflow graph. | Defaults to the next node in the workflow. |
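In the generated workflow definition, these transitions appear as the ok and error elements of the action node. A sketch, with node names as illustrative assumptions:

```xml
<action name="sqoop-node">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <!-- Sqoop action body omitted for brevity -->
    </sqoop>
    <!-- "OK To": taken when the action succeeds -->
    <ok to="next-node"/>
    <!-- "Error To": taken when the action fails; defaults to a kill node -->
    <error to="kill-node"/>
</action>
```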
Table 7.11. Sqoop Action, Advanced Properties Parameters
Parameter Name | Description | Additional Information | Example |
---|---|---|---|
Resource Manager | Master node that arbitrates all the available cluster resources among the competing applications. | The default setting is discovered from the cluster configuration. | ${resourceManager} |
Name Node | Manages the file system metadata. | Keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. Clients contact the NameNode for file metadata or file modifications. | ${nameNode} |
File | Select any files that you want to make available to the Sqoop action when the workflow runs. | | /user/centos/oozie/apps/sqoop/lib/hive-site.xml |
Archive | Select any archives that you want to make available to the Sqoop action when the workflow runs. | | archived data files |
Prepare | Select mkdir or delete and identify any HDFS paths to create or delete before starting the job. | Use delete to do file cleanup prior to job execution. Enables Oozie to retry a job if there is a transient failure (the job output directory must not exist prior to job start). If the path is to a directory: delete deletes all content recursively and then deletes the directory. mkdir creates all missing directories in the path. | |
Arg | Identify any arguments to be passed to Sqoop. | | |
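These advanced properties map onto child elements of the sqoop action node. A sketch under the same assumptions as earlier; all paths are illustrative, and the archive name is a hypothetical example. Note that the action schema accepts either a single command element or a series of arg elements, so Arg entries are rendered the same way as the Args form shown after Table 7.9:

```xml
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${resourceManager}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- Prepare: delete removes the path recursively; mkdir creates missing directories -->
    <prepare>
        <delete path="${nameNode}/user/centos/sqoop-output"/>
        <mkdir path="${nameNode}/user/centos/sqoop-staging"/>
    </prepare>
    <!-- Command omitted for brevity; see the sketch after Table 7.9 -->
    <!-- File and Archive entries are made available in the action's working directory -->
    <file>/user/centos/oozie/apps/sqoop/lib/hive-site.xml</file>
    <archive>/user/centos/oozie/apps/sqoop/lib/dependencies.tar.gz#deps</archive>
</sqoop>
```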
Table 7.12. Sqoop Action, Configuration Parameters
Parameter Name | Description | Additional Information | Example |
---|---|---|---|
Name and Value | The name/value pair can be used instead of a job.xml file or can override parameters set in the job.xml file. | Used to specify formal parameters. If the name and value are specified, the user can override the values from the Submit dialog box. Can be parameterized (templatized) using EL expressions. See the Apache Sqoop documentation for more information. | |
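In the generated action definition, each name/value pair becomes a property element inside the configuration block. A sketch, using a queue-name property and an EL expression as illustrative assumptions:

```xml
<configuration>
    <property>
        <name>mapred.job.queue.name</name>
        <!-- The value is parameterized with an EL expression, resolved at submission -->
        <value>${queueName}</value>
    </property>
</configuration>
```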