Using the JdbcStorageHandler, you can connect Apache Hive to a MySQL, PostgreSQL,
Oracle, DB2, or Derby data source. You can then create an external table to represent the
data, and query the table.
This task assumes you are a
Cloudera Private Cloud Base user. You create an external table that uses the JdbcStorageHandler
to connect to and read a local JDBC data source.
-
Load data into a supported SQL database, such as MySQL, on a node in your
cluster, or familiarize yourself with existing data in the your database.
-
Create an external table using the JdbcStorageHandler and table properties that
specify the minimum information: database type, driver, database connection
string, user name and password for querying hive, table name, and number of
active connections to Hive.
CREATE EXTERNAL TABLE mytable_jdbc(
col1 string,
col2 int,
col3 double
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
"hive.sql.database.type" = "MYSQL",
"hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
"hive.sql.jdbc.url" = "jdbc:mysql://localhost/sample",
"hive.sql.dbcp.username" = "hive",
"hive.sql.dbcp.password" = "hive",
"hive.sql.table" = "MYTABLE",
"hive.sql.dbcp.maxActive" = "1"
);
-
Query the external table.
SELECT * FROM mytable_jdbc WHERE col2 = 19;