Using Cloudera Connector for Netezza
After you have installed the Cloudera Connector for Netezza and copied the required JDBC driver for Netezza to the lib directory of the Sqoop installation, use the connector by invoking the Sqoop tools with the appropriate connection string.
The connection string must be of the form jdbc:netezza://<nz-host>/<nz-instance>, where:
- <nz-host> is the hostname of the machine where the Netezza server runs.
- <nz-instance> is the Netezza database instance name.
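As a minimal sketch of the format above, the connection string can be assembled from shell variables before it is passed to --connect. The variable names and the host nz01.example.com are hypothetical placeholders, not values required by the connector.

```shell
# Hypothetical values for illustration; substitute your own host and database.
NZ_HOST="nz01.example.com"
NZ_INSTANCE="MYDB"

# Assemble the JDBC URL in the form jdbc:netezza://<nz-host>/<nz-instance>.
NZ_URL="jdbc:netezza://${NZ_HOST}/${NZ_INSTANCE}"
echo "${NZ_URL}"
```

The resulting value can then be supplied as --connect "${NZ_URL}" in the commands that follow.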
To use the Netezza connector, you must specify the --direct option along with a number of mappers greater than one.
For example, the following command invokes the Sqoop import tool with eight mappers and uses the Cloudera Connector for Netezza:
$ sqoop import --connect jdbc:netezza://localhost/MYDB --username arvind \
    --password xxxxx --direct --table MY_TABLE --num-mappers 8 --escaped-by '\\' \
    --fields-terminated-by ','
The following command invokes the Sqoop export tool with eight mappers and uses the Cloudera Connector for Netezza:
$ sqoop export --connect jdbc:netezza://localhost/MYDB --username arvind \
    --password xxxxx --direct --export-dir /user/arvind/MY_TABLE --table MY_TABLE_TARGET \
    --num-mappers 8 --input-escaped-by '\\'
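Because the connector requires --direct together with more than one mapper, a wrapper script might validate the mapper count before launching Sqoop. The following is only a sketch: check_num_mappers and NUM_MAPPERS are hypothetical names, and the script prints a message instead of invoking Sqoop.

```shell
# Hypothetical helper: succeeds only when the argument is an integer greater than one.
check_num_mappers() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -gt 1 ]
}

NUM_MAPPERS=8
if check_num_mappers "${NUM_MAPPERS}"; then
  echo "ok: --direct with --num-mappers ${NUM_MAPPERS}"
else
  echo "error: --num-mappers must be greater than one" >&2
fi
```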
The direct mode Netezza connector supports the following Netezza-specific arguments for imports and exports:
- --nz-maxerrors <n>: Specifies the number of error records needed to abort an import or export operation. By default, this is set to 1, which implies that the operation fails on the first bad record encountered. Setting it to a higher value, for example 3, configures Sqoop to continue despite two bad records but abort on the third. If you set the value to 0, the operation never aborts due to bad records.
- --nz-logdir <path>: Specifies a directory on the local filesystem where Sqoop places the Netezza transport-specific nzbad and nzlog files. Use these files to debug and tune the operation. This path is local to the hosts on which the Sqoop map jobs run and does not apply to the system from which Sqoop is launched. If the directory does not exist, Sqoop attempts to create it before initiating the transport.
- --nz-ctrlchars: Instructs Sqoop to use the CTRLCHARS parameter when exchanging data with Netezza. This parameter allows processing of data that contains ASCII characters with a value of 31 or less.
- --nz-uploaddir: Specifies the HDFS directory in which the generated Netezza logs are stored.
- --schema: Defines the schema Sqoop uses for both import and export. The schema is used for metadata queries, and for record extraction and loading.
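For example, the following export command passes --nz-maxerrors 0 after the -- separator, so the operation never aborts due to bad records:
$ sqoop export --connect jdbc:netezza://localhost/MYDB --username arvind \
    --password xxxxx --direct --export-dir /user/arvind/MY_TABLE --table MY_TABLE_TARGET \
    --num-mappers 8 --input-escaped-by '\\' -- --nz-maxerrors 0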
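The --nz-maxerrors threshold described in the list above can be modeled with a short script. This is an illustrative model of the documented rule only, not Sqoop or Netezza code; nz_would_abort is a hypothetical function name.

```shell
# Hypothetical model of the documented --nz-maxerrors rule; not Sqoop code.
# Usage: nz_would_abort <maxerrors> <bad_record_count>
# Prints "abort" or "continue".
nz_would_abort() {
  max_errors="$1"
  bad_records="$2"
  # A threshold of 0 disables aborting; otherwise abort once the count reaches it.
  if [ "${max_errors}" -gt 0 ] && [ "${bad_records}" -ge "${max_errors}" ]; then
    echo "abort"
  else
    echo "continue"
  fi
}

nz_would_abort 1 1    # default threshold: the first bad record aborts
nz_would_abort 3 2    # two bad records are tolerated
nz_would_abort 3 3    # the third bad record aborts
nz_would_abort 0 50   # a threshold of 0 never aborts
```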