Import RDBMS data into Hive

You can test the Apache Sqoop import command and then execute the command to import relational database tables into Hive.

You enter the Sqoop import command on the command line of your Hive cluster to import data from a data source to Hive. You can test the import statement before actually executing it.
  • In Cloudera Manager, the Apache Sqoop client is running.
  • In Cloudera Manager, the Hive Metastore, HiveServer, and Hive services are running.
  1. Optionally, test the import command before execution using the eval option.
    sqoop eval --connect jdbc:mysql://db.foo.com/bar \
    --query "SELECT * FROM employees LIMIT 10"
    The output of the select statement appears listing 10 rows of data from the RDBMS employees table.
  2. Execute a Sqoop import command that specifies the Sqoop connection to the RDBMS, the data you want to import, and the destination Hive table name.
    This command imports the MySQL EMPLOYEES table to a new Hive table named in the warehouse.
    sqoop import --connect jdbc:mysql://db.foo.com:3306/corp \
    --table EMPLOYEES \
    --hive-import \
    --create-hive-table \
    --hive-table mydb.newtable