To determine the Impala configurations required for creating an Airflow connection
for Cloudera Base on premises Impala, perform the
following steps:
Go to Cloudera Manager > IMPALA > Instances.
On the Instances page, find a host that has
Impala Daemon as its Role Type and copy its
hostname to a text editor. If several hosts have
Impala Daemon as their Role Type, copy any one of them.
Go to the Configurations tab.
Copy the following configuration values and paste
them into the text editor along with the hostname copied earlier:
Port: the port number in the
Impala Daemon HiveServer2 HTTP Port field.
use_ssl: True if the
client_services_ssl_enabled field is enabled;
otherwise, False.
kerberos_service_name: the value of
the kerberos_princ_name field.
To determine the Hive configurations required for creating an Airflow connection for
Cloudera Base on premises Hive, perform the
following steps:
Go to Cloudera Manager > HIVE_ON_TEZ > Instances.
On the Instances page, find a host that has
HiveServer2 as its Role Type and copy its
hostname to a text editor. If several hosts have
HiveServer2 as their Role Type, copy any one of them.
Go to the Configurations tab.
Copy the following configuration values and paste them
into the text editor along with the hostname copied earlier:
Port: the port number in the
HiveServer2 Port field.
use_ssl: True if the
hive.server2.use.SSL field is enabled;
otherwise, False.
kerberos_service_name: the value of
the kerberos_princ_name field.
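The values gathered in the Impala and Hive steps above can be kept together in a small structure instead of loose notes. The following is a minimal sketch, not part of the Cloudera tooling: the `CollectedConfig` class, the `collect` helper, and every hostname, port, and service name shown are hypothetical placeholders to be replaced with the values copied from Cloudera Manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectedConfig:
    """Values copied from Cloudera Manager for one service (hypothetical helper)."""
    host: str
    port: int
    use_ssl: bool
    kerberos_service_name: str


def collect(host: str, port: str, ssl_field_enabled: bool,
            kerberos_princ_name: str) -> CollectedConfig:
    """Normalize the copied values.

    ssl_field_enabled reflects client_services_ssl_enabled (Impala)
    or hive.server2.use.SSL (Hive) as seen in the Configurations tab.
    """
    return CollectedConfig(
        host=host.strip(),
        port=int(port),                      # port fields are copied as text
        use_ssl=bool(ssl_field_enabled),     # True if the field is enabled
        kerberos_service_name=kerberos_princ_name.strip(),
    )


# Placeholder values only; substitute what you copied from your cluster.
impala = collect("impalad-1.example.com", "28000", True, "impala")
hive = collect("hs2-1.example.com", "10001", False, "hive")

print(impala)
print(hive)
```

Keeping one record per service makes it easier to check that the port, SSL setting, and Kerberos service name all come from the same service before they are entered in the Airflow connection.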
To create a connection to an existing Cloudera Base on premises Hive or Impala using the
embedded Airflow UI, perform the following steps:
In a text editor, define the following Airflow connection additional arguments
using the configuration values copied earlier:
The auth_mechanism value must be
GSSAPI for Kerberos authentication.
The use_ssl argument must be true
if the client_services_ssl_enabled field is
enabled for Impala or the
hive.server2.use.SSL field is enabled for
Hive in the Configurations tab. Otherwise, it
must be false.
The use_http_transport value must always be
true.
The http_path value must be
cliservice.
The kerberos_service_name value is the
kerberos_princ_name value in the
Configurations tab.
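The additional arguments listed above can be assembled as a JSON object. The sketch below is illustrative only: the `build_extra` function is a hypothetical helper, the service name passed to it is a placeholder, and it assumes the connection accepts JSON booleans for `use_ssl` and `use_http_transport`. If your environment expects string values instead, adjust accordingly.

```python
import json


def build_extra(use_ssl: bool, kerberos_service_name: str) -> str:
    """Return the additional arguments as a JSON string (hypothetical helper)."""
    extra = {
        "auth_mechanism": "GSSAPI",        # Kerberos authentication
        "use_ssl": use_ssl,                # from the SSL field in the Configurations tab
        "use_http_transport": True,        # must always be true
        "http_path": "cliservice",
        "kerberos_service_name": kerberos_service_name,  # kerberos_princ_name value
    }
    return json.dumps(extra, indent=2)


# Placeholder service name; use the kerberos_princ_name value you copied.
print(build_extra(use_ssl=True, kerberos_service_name="impala"))
```

Generating the JSON from the copied values, rather than typing it by hand, avoids quoting and spelling mistakes in the argument names.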