DistCp Action Parameters
DistCp is used for copying files from one cluster to another, or copying files within a single cluster.
Table 7.28. DistCp Action, General Parameters
Parameter Name | Description | Additional Information | Example |
---|---|---|---|
Arg | Arguments to the distcp command. | (1) Copy a file or directory from source to target: hdfs://nn1:8020/dir/file hdfs://nn2:8020/dir/file (2) Copy files from two source directories to the target: hdfs://nn1:8020/dir/a hdfs://nn1:8020/dir/b hdfs://nn2:8020/dir/dir2 (3) Copy files from source to target, overwriting existing content in the target where it differs from the source: -update hdfs://nn1:8020/source/first hdfs://nn1:8020/source/second hdfs://nn2:8020/target | |
Java Opts | Sets Java options for the DistCp command. | | -Xmn256m |
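
As a rough illustration, the general parameters above might map onto the `<distcp>` element of an Oozie workflow definition as follows. This is a sketch only: it assumes the `uri:oozie:distcp-action:0.2` schema version, and the XML that Workflow Manager actually generates may differ. The paths and the `-Xmn256m` option are taken from the table.

```xml
<!-- Sketch of a DistCp action body; the schema version is an assumption -->
<distcp xmlns="uri:oozie:distcp-action:0.2">
    <job-tracker>${resourceManager}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- Java Opts parameter -->
    <java-opts>-Xmn256m</java-opts>
    <!-- Arg parameter: each argument goes in its own <arg> element, in order -->
    <arg>-update</arg>
    <arg>hdfs://nn1:8020/source/first</arg>
    <arg>hdfs://nn1:8020/source/second</arg>
    <arg>hdfs://nn2:8020/target</arg>
</distcp>
```

Each `<arg>` element carries exactly one argument, so options such as `-update` and every source or target path are listed separately rather than joined into a single string.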
Table 7.29. DistCp Action, Transition Parameters
Parameter Name | Description | Additional Information | Default Setting |
---|---|---|---|
Error To | Indicates what node to transition to if the action fails. | You can modify this setting in the dialog box or by modifying the workflow graph. | Defaults to the kill node, but can be changed. |
OK To | Indicates what node to transition to if the action succeeds. | You can modify this setting in the dialog box or by modifying the workflow graph. | Defaults to the next node in the workflow. |
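
In workflow XML terms, the two transition parameters correspond to the `<ok>` and `<error>` elements of the action. The fragment below is a minimal sketch under stated assumptions: the node names `distcp-example`, `end`, and `kill` are hypothetical, and the schema version is assumed rather than taken from this guide.

```xml
<!-- Hypothetical node names; only the ok/error wiring is the point here -->
<action name="distcp-example">
    <distcp xmlns="uri:oozie:distcp-action:0.2">
        <job-tracker>${resourceManager}</job-tracker>
        <name-node>${nameNode}</name-node>
        <arg>hdfs://nn1:8020/dir/file</arg>
        <arg>hdfs://nn2:8020/dir/file</arg>
    </distcp>
    <!-- OK To: node to run when the copy succeeds -->
    <ok to="end"/>
    <!-- Error To: node to run when the copy fails -->
    <error to="kill"/>
</action>
<kill name="kill">
    <message>DistCp failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>
<end name="end"/>
```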
Table 7.30. DistCp Action, Advanced Properties Parameters
Parameter Name | Description | Additional Information | Example |
---|---|---|---|
Resource Manager | Master node that arbitrates all the available cluster resources among the competing applications. | The default setting is discovered from the cluster configuration. | ${resourceManager} |
Name Node | Manages the file system metadata. | Keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. Clients contact NameNode for file metadata or file modifications. | ${nameNode} |
Prepare | Select mkdir or delete and identify any HDFS paths to create or delete before starting the job. | Use delete to clean up files before job execution. This enables Oozie to retry a job after a transient failure, because the job output directory must not exist before the job starts. If the path is a directory, delete removes all content recursively and then removes the directory. mkdir creates all missing directories in the path. | |
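
The advanced properties might appear in the action definition roughly as shown below. This is a hedged sketch, not generated output: the schema version is assumed, and the `mkdir` path `/tmp/distcp-staging` is a hypothetical placeholder chosen for illustration.

```xml
<distcp xmlns="uri:oozie:distcp-action:0.2">
    <!-- Resource Manager and Name Node parameters -->
    <job-tracker>${resourceManager}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- Prepare parameter: runs before the job starts -->
    <prepare>
        <!-- Removes the target recursively so that a retry starts clean -->
        <delete path="hdfs://nn2:8020/target"/>
        <!-- Hypothetical path; missing parent directories are created too -->
        <mkdir path="${nameNode}/tmp/distcp-staging"/>
    </prepare>
    <arg>hdfs://nn1:8020/source/first</arg>
    <arg>hdfs://nn2:8020/target</arg>
</distcp>
```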
Table 7.31. DistCp Action, Configuration Parameters
Parameter Name | Description | Additional Information | Example |
---|---|---|---|
Name and Value | The name/value pair can be used instead of a job.xml file or can override parameters set in the job.xml file. | Used to specify formal parameters. If the name and value are specified, the user can override the values from the Submit dialog box. Can be parameterized (templatized) using EL expressions. | |
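
A name/value pair would typically end up as a `<property>` inside the action's `<configuration>` element. The sketch below assumes the `uri:oozie:distcp-action:0.2` schema; `mapreduce.job.queuename` is a standard Hadoop property, while `${queueName}` is a hypothetical workflow parameter used to show EL parameterization.

```xml
<distcp xmlns="uri:oozie:distcp-action:0.2">
    <job-tracker>${resourceManager}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
        <property>
            <!-- Name -->
            <name>mapreduce.job.queuename</name>
            <!-- Value, parameterized with an EL expression (hypothetical parameter) -->
            <value>${queueName}</value>
        </property>
    </configuration>
    <arg>hdfs://nn1:8020/dir/file</arg>
    <arg>hdfs://nn2:8020/dir/file</arg>
</distcp>
```

Because the value is an EL expression, it can be supplied or overridden at submission time instead of being hard-coded in the workflow.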