Creating a remote database using data connector

Learn how to create a Hive data connector, which you can then use to create and map a remote database to Hive. The remote database can reside in an external data source, such as MySQL, PostGreSQL, Oracle, Redshift, Derby, or other HMS instances.

You must ensure that the hive.security.temporary.authorization.for.data.connectors property is set to "true".

In the Hive virtual warehouse details page, go to CONFIGURATIONS > Hiveserver2 > hive-site configuration file and click to add this property as a custom configuration.

  1. Create a data connector using the following syntax:
    CREATE CONNECTOR [IF NOT EXISTS] connector_name
        [TYPE datasource_type]
        [URL datasource_url]
        [COMMENT connector_comment]
        [WITH DC PROPERTIES (property_name=property_value, ...)];
    If you are using a cleartext password:
    CREATE CONNECTOR postgres_local 
        TYPE 'postgres' 
        URL 'jdbc:postgresql://localhost:5432' 
        WITH DC PROPERTIES (
            "hive.sql.dbcp.username"="postgres", 
            "hive.sql.dbcp.password"="postgres"
        );
    If you are using a Java keystore instead of a cleartext password:
    CREATE CONNECTOR postgres_local_ks 
        TYPE 'postgres' 
        URL 'jdbc:postgresql://localhost:5432' 
        WITH DC PROPERTIES (
            "hive.sql.dbcp.username"="postgres",
            "hive.sql.dbcp.password.keystore"="jceks://app/local/hive/secrets.jceks",
            "hive.sql.dbcp.password.key"="postgres.credential"
        );
    
  2. Create a remote database in Hive using the data connector that you created in the previous step.
    CREATE [REMOTE] (DATABASE) [IF NOT EXISTS] database_name
        [COMMENT database_comment]
        [USING connector_name]
        [WITH DBPROPERTIES (property_name=property_value, ...)];
    CREATE REMOTE DATABASE postgres_hive_test 
        USING postgres_local 
        WITH DBPROPERTIES ("connector.remoteDbName"="remote_db_test");

    This statement maps a remote database named "remote_db_test" to a Hive database named "postgres_hive_test" in Hive.

  3. You can now use the tables that are available in the remote database.
    USE remote_db_test;
    
    SHOW TABLES;
    
    DESCRIBE [formatted] <table name>;
    
    SELECT <col1> FROM <table name> WHERE <filter1> AND <filter2>;