Hive catalog
You can add Hive as a catalog in Flink SQL by adding the Hive dependency to your project, registering the Hive catalog in Java, and enabling it either globally in Cloudera Manager or per user in a custom environment file.
The Hive catalog serves two purposes:
- It is a persistent storage for pure Flink metadata
- It is an interface for reading and writing existing Hive tables
Add the following Maven dependency to your project:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-hive_2.11</artifactId>
  <version>1.10.0-csa1.2.0.0</version>
</dependency>
The following example shows how to register and use the Hive catalog from Java:
String HIVE = "hive";
String DB = "default";
String HIVE_CONF_DIR = "/etc/hive/conf";
String HIVE_VERSION = "3.1.3000";

// Register the Hive Metastore as a catalog and make it the current
// catalog for subsequent statements.
HiveCatalog catalog = new HiveCatalog(HIVE, DB, HIVE_CONF_DIR, HIVE_VERSION);
tableEnv.registerCatalog(HIVE, catalog);
tableEnv.useCatalog(HIVE);
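Once the catalog is registered, the same TableEnvironment can browse it. A minimal sketch continuing the example above; it assumes the tableEnv variable and the HIVE and DB constants already defined, and it needs a running Flink application with Hive access, so it is illustrative rather than directly runnable:

```java
// Continues the registration example above: tableEnv is the same
// TableEnvironment, HIVE and DB are the constants defined earlier.
tableEnv.useCatalog(HIVE);
tableEnv.useDatabase(DB);

// List the tables of the current Hive database. Both existing Hive
// tables and tables created through Flink appear here, because the
// catalog is backed by the Hive Metastore.
for (String tableName : tableEnv.listTables()) {
    System.out.println(tableName);
}
```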
To use the Hive catalog from the SQL client, enable it either globally in Cloudera Manager or per user with a custom environment file in YAML format.
To enable Hive Catalog in Cloudera Manager:
- Log in to Cloudera Manager.
- Go to .
- Enable Hive Service.
- Enable Hive Catalog for SQL Client.
To enable the Hive catalog per user, add the following snippet to the catalogs section of the custom environment file:
...
catalogs:
  - name: hive
    type: hive
    hive-conf-dir: /etc/hive/conf
    hive-version: 3.1.3000
...
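The per-user environment file is passed to the SQL client at startup. A sketch, assuming the snippet above was saved as custom-env.yaml (a hypothetical file name) and that the flink-sql-client wrapper accepts the standard Flink SQL client -e option for environment files:

```shell
# custom-env.yaml is a hypothetical name for a file containing the
# catalogs section shown above.
flink-sql-client embedded -e custom-env.yaml
```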
Launch the flink-sql-client and test the Hive catalog with the following commands:

Flink SQL> show catalogs;
default_catalog
hive

Flink SQL> use catalog hive;
Flink SQL> show tables;
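Because the catalog is backed by the Hive Metastore, the tables listed above include existing Hive tables, which can be inspected and queried directly. A sketch of a quick check; "transactions" is a hypothetical table name used for illustration:

```sql
Flink SQL> use catalog hive;
Flink SQL> describe transactions;
Flink SQL> select * from transactions;
```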