Connecting to Cloudera Data Warehouse
The provided examples use Kerberos authentication to connect to Cloudera Data Warehouse Hive and Impala. Kerberos authentication requires that your Keytab is set and that you have the proper permissions to access Cloudera Data Warehouse.
To get the Cloudera Data Warehouse Hive and Impala JDBC Kerberos URLs:
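Before connecting, it can help to confirm that a Kerberos ticket is actually available in the session. The following is a minimal sketch, not part of the product: the helper name `has_kerberos_ticket` is hypothetical, and it assumes the MIT Kerberos `klist` command is on the `PATH`. With the `-s` flag, `klist` prints nothing and signals a readable, unexpired credential cache through its exit status.

```python
import subprocess


def has_kerberos_ticket(run=subprocess.run):
    """Return True if `klist -s` reports a readable, unexpired credential cache.

    `run` is injectable only so that the check can be exercised without a
    real Kerberos installation (hypothetical helper, for illustration).
    """
    try:
        # `klist -s` is silent; exit status 0 means a valid ticket cache exists.
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # klist is not installed or not on PATH.
        return False
    return result.returncode == 0
```

Calling `has_kerberos_ticket()` before `connect()` gives an early, readable failure instead of a GSSAPI error raised during the connection attempt.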
- Go to .
- Select your Virtual Warehouse.
- Copy the JDBC URL.
Connecting to Cloudera Data Warehouse Impala
Python
from impala.dbapi import connect

# Example JDBC URL copied from the Virtual Warehouse:
# jdbc:impala://coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com:443/default;AuthMech=1;transportMode=http;httpPath=cliservice;ssl=1;KrbHostFQDN=dwx-env-rhcxab-env.cdp.local.;KrbServiceName=hive
conn = connect(
    host="coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com",  # host from the JDBC URL
    port=443,  # port from the JDBC URL
    auth_mechanism="GSSAPI",  # always GSSAPI for Kerberos
    use_http_transport=True,  # True if transportMode=http in the JDBC URL, otherwise False
    http_path="cliservice",  # httpPath from the JDBC URL; always cliservice
    use_ssl=True,  # True if ssl=1 in the JDBC URL, otherwise False
    kerberos_service_name="hive",  # KrbServiceName from the JDBC URL
    krb_host="dwx-env-rhcxab-env.cdp.local.",  # KrbHostFQDN from the JDBC URL
)

# Execute a SQL statement and print the result
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.fetchall())
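The mapping from the Impala JDBC URL to the `connect()` arguments can also be done programmatically. The following is a minimal sketch that uses only the standard library. The function name `impala_jdbc_to_kwargs` is hypothetical, it handles only Kerberos URLs (`AuthMech=1`), and the fallback to port 443 when the URL has no port is an assumption.

```python
def impala_jdbc_to_kwargs(jdbc_url):
    """Translate a Kerberos Impala JDBC URL into impyla connect() keyword arguments."""
    prefix = "jdbc:impala://"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a jdbc:impala:// URL")

    # Separate "host:port/database" from the ";key=value" properties.
    location, *props = jdbc_url[len(prefix):].split(";")
    authority = location.split("/", 1)[0]
    host, _, port = authority.partition(":")
    options = dict(p.split("=", 1) for p in props if "=" in p)

    # This sketch only covers Kerberos, which the JDBC driver encodes as AuthMech=1.
    if options.get("AuthMech") != "1":
        raise ValueError("only Kerberos URLs (AuthMech=1) are handled")

    return {
        "host": host,
        "port": int(port) if port else 443,  # assumption: 443 when no port is given
        "auth_mechanism": "GSSAPI",
        "use_http_transport": options.get("transportMode") == "http",
        "http_path": options.get("httpPath", ""),
        "use_ssl": options.get("ssl") == "1",
        # Falls back to impyla's own default service name when the URL omits it.
        "kerberos_service_name": options.get("KrbServiceName", "impala"),
        "krb_host": options.get("KrbHostFQDN"),
    }
```

The result can be passed directly to impyla, for example `conn = connect(**impala_jdbc_to_kwargs(jdbc_url))`.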
Connecting to Cloudera Data Warehouse Hive
Python
from impala.dbapi import connect

# Example JDBC URL copied from the Virtual Warehouse:
# jdbc:hive2://hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com/default;transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;retries=3;kerberosEnableCanonicalHostnameCheck=false;principal=hive/dwx-env-rhcxab-env.cdp.local@QE-AD-1.CLOUDERA.COM
conn = connect(
    host='hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com',  # host from the JDBC URL
    port=443,  # use the port from the JDBC URL if it specifies one; this example URL does not, so the standard HTTPS port 443 is used
    use_ssl=True,  # True if ssl=true in the JDBC URL, otherwise False
    use_http_transport=True,  # True if transportMode=http in the JDBC URL, otherwise False
    kerberos_service_name='hive',  # the part of the principal before the /, here hive
    auth_mechanism='GSSAPI',  # always GSSAPI for Kerberos
    http_path="cliservice",  # httpPath from the JDBC URL
    krb_host="dwx-env-rhcxab-env.cdp.local",  # the part of the principal after the / and before the @
)
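For Hive, the Kerberos service name and host are carried in the `principal` property of the JDBC URL rather than in separate properties. The following is a minimal sketch of that extraction using only the standard library. The function names `split_hive_principal` and `hive_jdbc_to_kwargs` are hypothetical, and the fallback to port 443 when the URL has no port is an assumption.

```python
def split_hive_principal(principal):
    """Split 'service/host@REALM' into (service, host, realm)."""
    service, sep, rest = principal.partition("/")
    if not sep:
        raise ValueError("principal must look like service/host@REALM")
    host, _, realm = rest.partition("@")
    return service, host, realm


def hive_jdbc_to_kwargs(jdbc_url):
    """Translate a Kerberos Hive JDBC URL into impyla connect() keyword arguments."""
    prefix = "jdbc:hive2://"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a jdbc:hive2:// URL")

    # Separate "host[:port]/database" from the ";key=value" properties.
    location, *props = jdbc_url[len(prefix):].split(";")
    authority = location.split("/", 1)[0]
    host, _, port = authority.partition(":")
    options = dict(p.split("=", 1) for p in props if "=" in p)

    if "principal" not in options:
        raise ValueError("the JDBC URL has no principal property")
    service, krb_host, _realm = split_hive_principal(options["principal"])

    return {
        "host": host,
        "port": int(port) if port else 443,  # assumption: 443 when no port is given
        "use_ssl": options.get("ssl") == "true",
        "use_http_transport": options.get("transportMode") == "http",
        "kerberos_service_name": service,
        "auth_mechanism": "GSSAPI",
        "http_path": options.get("httpPath", ""),
        "krb_host": krb_host,
    }
```

The result can be passed directly to impyla, for example `conn = connect(**hive_jdbc_to_kwargs(jdbc_url))`.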