Connecting to CDW
The provided examples use Kerberos to authenticate to Cloudera Data Warehouse (CDW) Hive and Impala. This requires that your Kerberos keytab is set and that you have the permissions needed to access CDW.
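Before connecting, it can help to confirm that a Kerberos ticket is actually available in the session. The following is a minimal sketch, assuming the `klist` command-line tool is installed and on the PATH; the helper name `has_kerberos_ticket` is hypothetical and only illustrates the check.

```python
import subprocess


def has_kerberos_ticket():
    """Return True if `klist -s` reports a valid ticket in the credential cache."""
    try:
        result = subprocess.run(
            ["klist", "-s"],  # -s: silent mode, the result is conveyed by the exit status
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # klist is not installed, not on PATH, or could not be started
        return False
    return result.returncode == 0


ticket_ok = has_kerberos_ticket()
print("Kerberos ticket found" if ticket_ok else "No valid Kerberos ticket; check the keytab")
```

If the check reports no ticket, the connection attempts in the examples are likely to fail during GSSAPI authentication.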
To get the Kerberos JDBC URLs for CDW Hive and Impala:
- Go to .
- Select your Virtual Warehouse.
- Copy the JDBC URL.
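The fields of the copied Impala JDBC URL map directly onto the keyword arguments of `impala.dbapi.connect`. The sketch below shows one way to extract them with the standard library only. It assumes the URL has the layout `jdbc:impala://host:port/database;key=value;...`; the helper name `parse_impala_jdbc_url` is hypothetical and is not part of impyla.

```python
from urllib.parse import urlsplit


def parse_impala_jdbc_url(jdbc_url):
    """Turn a CDW Impala JDBC URL into keyword arguments for impala.dbapi.connect."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Separate "impala://host:port/db" from the ";key=value" properties.
    location, _, raw_props = jdbc_url[len("jdbc:"):].partition(";")
    parts = urlsplit(location)
    props = dict(p.split("=", 1) for p in raw_props.split(";") if "=" in p)
    return {
        "host": parts.hostname,
        "port": parts.port or 443,  # assumption: fall back to 443 if no port is given
        "auth_mechanism": "GSSAPI",  # Kerberos
        "use_http_transport": props.get("transportMode") == "http",
        "http_path": props.get("httpPath", "cliservice"),
        "use_ssl": props.get("ssl") in ("1", "true"),
        "kerberos_service_name": props.get("KrbServiceName"),
        "krb_host": props.get("KrbHostFQDN"),
    }


example_url = (
    "jdbc:impala://coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com:443/default;"
    "AuthMech=1;transportMode=http;httpPath=cliservice;ssl=1;"
    "KrbHostFQDN=dwx-env-rhcxab-env.cdp.local.;KrbServiceName=hive"
)
impala_kwargs = parse_impala_jdbc_url(example_url)
print(impala_kwargs)
```

The resulting dictionary could then be passed as `connect(**impala_kwargs)`.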
Connecting to CDW Impala
Python
from impala.dbapi import connect

# Example JDBC URL:
# jdbc:impala://coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com:443/default;AuthMech=1;transportMode=http;httpPath=cliservice;ssl=1;KrbHostFQDN=dwx-env-rhcxab-env.cdp.local.;KrbServiceName=hive
conn = connect(
    host="coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com",  # host part of the JDBC URL
    port=443,  # port part of the JDBC URL
    auth_mechanism="GSSAPI",  # always GSSAPI for Kerberos
    use_http_transport=True,  # True if transportMode=http in the JDBC URL, otherwise False
    http_path="cliservice",  # httpPath in the JDBC URL; always cliservice
    use_ssl=True,  # True if ssl=1 in the JDBC URL, otherwise False
    kerberos_service_name="hive",  # KrbServiceName in the JDBC URL
    krb_host="dwx-env-rhcxab-env.cdp.local.",  # KrbHostFQDN in the JDBC URL
)
# Execute a SQL statement and print the result
cursor = conn.cursor()
cursor.execute("show databases")
print(cursor.fetchall())
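The cursor follows the Python DB-API, so after `execute` the rows can be read with `fetchall()` and the column names with `cursor.description`. The sketch below pairs the two into dictionaries. The helper name `fetch_as_dicts` is hypothetical, and the demonstration uses an in-memory `sqlite3` database purely as a stand-in for a CDW connection so that it runs without a cluster.

```python
import sqlite3


def fetch_as_dicts(cursor):
    """Return all remaining rows of an executed DB-API cursor as dictionaries."""
    # The first item of each description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in demonstration: sqlite3 replaces the CDW connection here.
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()
demo_cursor.execute("SELECT 'default' AS name")
rows = fetch_as_dicts(demo_cursor)
print(rows)
demo_conn.close()
```

With a CDW connection, the same helper would be called on the cursor right after `cursor.execute(...)`.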
Connecting to CDW Hive
Python
from impala.dbapi import connect

# Example JDBC URL:
# jdbc:hive2://hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com/default;transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;retries=3;kerberosEnableCanonicalHostnameCheck=false;principal=hive/dwx-env-rhcxab-env.cdp.local@QE-AD-1.CLOUDERA.COM
conn = connect(
    host="hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com",  # host part of the JDBC URL
    port=443,  # this example URL has no explicit port; 443 (HTTPS) is used because ssl=true
    use_ssl=True,  # True if ssl=true in the JDBC URL, otherwise False
    use_http_transport=True,  # True if transportMode=http in the JDBC URL, otherwise False
    kerberos_service_name="hive",  # part of the principal before the "/", here: hive
    auth_mechanism="GSSAPI",  # leave as is
    http_path="cliservice",  # leave as is
    krb_host="dwx-env-rhcxab-env.cdp.local",  # part of the principal between "/" and "@"
)
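For Hive, the Kerberos service name and host are not separate URL properties; both come from the `principal` value, which has the form `service/host@REALM`. The sketch below splits the principal and assembles the connection arguments. It assumes the URL layout shown in the example; the helper names `split_principal` and `hive_connect_kwargs` are hypothetical, and the default port of 443 is an assumption for URLs that carry no explicit port.

```python
from urllib.parse import urlsplit


def split_principal(principal):
    """Split 'service/host@REALM' into (service, host, realm)."""
    service, _, rest = principal.partition("/")
    host, _, realm = rest.partition("@")
    if not service or not host:
        raise ValueError("expected a principal of the form service/host@REALM")
    return service, host, realm


def hive_connect_kwargs(jdbc_url, default_port=443):
    """Turn a CDW Hive JDBC URL into keyword arguments for impala.dbapi.connect."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Separate "hive2://host/db" from the ";key=value" properties.
    location, _, raw_props = jdbc_url[len("jdbc:"):].partition(";")
    parts = urlsplit(location)
    props = dict(p.split("=", 1) for p in raw_props.split(";") if "=" in p)
    service, krb_host, _realm = split_principal(props["principal"])
    return {
        "host": parts.hostname,
        "port": parts.port or default_port,  # assumption: 443 when the URL has no port
        "use_ssl": props.get("ssl") == "true",
        "use_http_transport": props.get("transportMode") == "http",
        "kerberos_service_name": service,
        "auth_mechanism": "GSSAPI",
        "http_path": props.get("httpPath", "cliservice"),
        "krb_host": krb_host,
    }


hive_url = (
    "jdbc:hive2://hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com/default;"
    "transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;retries=3;"
    "kerberosEnableCanonicalHostnameCheck=false;"
    "principal=hive/dwx-env-rhcxab-env.cdp.local@QE-AD-1.CLOUDERA.COM"
)
hive_kwargs = hive_connect_kwargs(hive_url)
print(hive_kwargs)
```

The resulting dictionary could then be passed as `connect(**hive_kwargs)`.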