Cloudera Runtime includes the Sqoop Client for bulk importing and exporting data from
diverse data sources to Hive.
-
In Cloudera Manager, in Clusters, select Add Service from the options menu.
-
Select the Sqoop Client and click Continue.
-
Choose a JDBC database driver, depending on the data source of the source or
destination for a Sqoop import or export, respectively.
-
Install the JDBC database driver in
/var/lib/sqoop
on the Sqoop
node.
Do not install the/opt/cloudera/parcels/CDH
because upgrades modify
this directory.
- MySQL: Download the driver from
http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.48.tar.gz
to /var/lib/sqoop, and then run tar -xvzf
mysql-connector-java-5.1.48.tar.gz
.
- Oracle: Download the driver from
https://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html
and put it in /var/lib/sqoop
.
- PostgreSQL: Download the driver from
http://jdbc.postgresql.org/download.html
and put it in
/var/lib/sqoop
.
-
In Cloudera Manager, click
.
After setting up the Sqoop client, you can enter Sqoop commands using the following connection string, depending
on your data source.
-
MySQL Syntax:
jdbc:mysql://<HOST>:<PORT>/<DATABASE_NAME>
Example:
jdbc:mysql://my_mysql_server_hostname:3306/my_database_name
-
Oracle Syntax:
jdbc:oracle:thin:@<HOST>:<PORT>:<DATABASE_NAME>
Example:
jdbc:oracle:thin:@my_oracle_server_hostname:1521:my_database_name
PostgreSQL Syntax:
jdbc:postgresql://<HOST>:<PORT>/<DATABASE_NAME>
Example:
jdbc:postgresql://my_postgres_server_hostname:5432/my_database_name
Netezza
Syntax:
jdbc:netezza://<HOST>:<PORT>/<DATABASE_NAME>
Example:jdbc:netezza://my_netezza_server_hostname:5480/my_database_name
-
Teradata Syntax:
jdbc:teradata://<HOST>/DBS_PORT=1025/DATABASE=<DATABASE_NAME>
Example:
jdbc:teradata://my_teradata_server_hostname/DBS_PORT=1025/DATABASE=my_database_name