Cloudera Runtime includes the Sqoop Client for bulk importing and exporting data from
diverse data sources to Hive. You learn how to install the RDBMS connector and Sqoop Client in
CDP.
-
In Cloudera Manager, in Clusters, select Add Service from the options menu.
-
Select the Sqoop Client and click Continue.
-
Choose a JDBC database driver, depending on the data source of the source or
destination for a Sqoop import or export, respectively.
-
Install the JDBC database driver in
/var/lib/sqoop
on the Sqoop
node.
Do not install the/opt/cloudera/parcels/CDH
because upgrades modify
this directory.
- MySQL: Download the MySQL driver
https://dev.mysql.com/downloads/connector/j/
to /var/lib/sqoop, and then run tar -xvzf
mysql-connector-java-<version>.tar.gz
.
- Oracle: Download the driver from
https://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html
and put it in /var/lib/sqoop
.
- PostgreSQL: Download the driver from
http://jdbc.postgresql.org/download.html
and put it in
/var/lib/sqoop
.
-
In Cloudera Manager, click
.
After setting up the Sqoop client, you can enter Sqoop commands using the following connection string, depending
on your data source.
-
MySQL Syntax:
jdbc:mysql://<HOST>:<PORT>/<DATABASE_NAME>
Example:
jdbc:mysql://my_mysql_server_hostname:3306/my_database_name
-
Oracle Syntax:
jdbc:oracle:thin:@<HOST>:<PORT>:<DATABASE_NAME>
Example:
jdbc:oracle:thin:@my_oracle_server_hostname:1521:my_database_name
PostgreSQL Syntax:
jdbc:postgresql://<HOST>:<PORT>/<DATABASE_NAME>
Example:
jdbc:postgresql://my_postgres_server_hostname:5432/my_database_name
Netezza
Syntax:
jdbc:netezza://<HOST>:<PORT>/<DATABASE_NAME>
Example:jdbc:netezza://my_netezza_server_hostname:5480/my_database_name
-
Teradata Syntax:
jdbc:teradata://<HOST>/DBS_PORT=1025/DATABASE=<DATABASE_NAME>
Example:
jdbc:teradata://my_teradata_server_hostname/DBS_PORT=1025/DATABASE=my_database_name