Connecting to CDW

The provided examples use Kerberos to authenticate when connecting to Cloudera Data Warehouse (CDW) Hive and Impala. This requires that a keytab is set and that you have the proper permissions to access CDW.

To get the CDW Hive and Impala JDBC Kerberos URLs:
  1. Go to Data Warehouse > Virtual Warehouses.
  2. Select your Virtual Warehouse.
  3. Copy the JDBC URL.

Connecting to CDW Impala

Python

from impala.dbapi import connect

# Example JDBC URL copied from the Virtual Warehouse:
# jdbc:impala://coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com:443/default;AuthMech=1;transportMode=http;httpPath=cliservice;ssl=1;KrbHostFQDN=dwx-env-rhcxab-env.cdp.local.;KrbServiceName=hive

conn = connect(
    host="coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com",  # host from the JDBC URL
    port=443,  # port from the JDBC URL
    auth_mechanism="GSSAPI",  # always GSSAPI for Kerberos
    use_http_transport=True,  # True if transportMode=http in the JDBC URL, otherwise False
    http_path="cliservice",  # this is always cliservice
    use_ssl=True,  # True if ssl=1 in the JDBC URL, otherwise False
    kerberos_service_name="hive",  # KrbServiceName in the JDBC URL
    krb_host="dwx-env-rhcxab-env.cdp.local.",  # KrbHostFQDN in the JDBC URL
)

# Run a SQL statement and print the rows it returns
cursor = conn.cursor()

cursor.execute('show databases')
print(cursor.fetchall())
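
The mapping from JDBC URL to connect() arguments described in the comments above can also be done programmatically. The following is a minimal sketch, not part of impyla or CDW: the helper name impala_jdbc_to_connect_kwargs is hypothetical, and it assumes the URL has the same shape as the example above (a jdbc:impala:// prefix followed by semicolon-separated key=value properties).

```python
from urllib.parse import urlsplit


def impala_jdbc_to_connect_kwargs(jdbc_url):
    """Hypothetical helper: map a CDW Impala JDBC Kerberos URL to connect() kwargs."""
    if not jdbc_url.startswith("jdbc:impala://"):
        raise ValueError("expected a URL starting with jdbc:impala://")
    # Split "impala://host:port/db" from the ";key=value" properties.
    location, _, raw_props = jdbc_url[len("jdbc:"):].partition(";")
    parts = urlsplit(location)
    props = dict(p.split("=", 1) for p in raw_props.split(";") if "=" in p)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "auth_mechanism": "GSSAPI",  # always GSSAPI for Kerberos
        "use_http_transport": props.get("transportMode") == "http",
        "http_path": props.get("httpPath", "cliservice"),
        "use_ssl": props.get("ssl") == "1",
        "kerberos_service_name": props.get("KrbServiceName"),
        "krb_host": props.get("KrbHostFQDN"),
    }


jdbc_url = (
    "jdbc:impala://coordinator-cdw-impala.apps.shared-os-qe-01.kcloud.cloudera.com:443/default;"
    "AuthMech=1;transportMode=http;httpPath=cliservice;ssl=1;"
    "KrbHostFQDN=dwx-env-rhcxab-env.cdp.local.;KrbServiceName=hive"
)
kwargs = impala_jdbc_to_connect_kwargs(jdbc_url)
```

The resulting dictionary can then be passed to impyla as connect(**kwargs), which avoids copying each value by hand when the Virtual Warehouse changes.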

Connecting to CDW Hive

Python

from impala.dbapi import connect

# Example JDBC URL copied from the Virtual Warehouse:
# jdbc:hive2://hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com/default;transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;retries=3;kerberosEnableCanonicalHostnameCheck=false;principal=hive/dwx-env-rhcxab-env.cdp.local@QE-AD-1.CLOUDERA.COM

conn = connect(
    host="hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com",  # host from the JDBC URL
    port=443,  # not given in this JDBC URL; 443 is the HTTPS port
    use_ssl=True,  # True if ssl=true in the JDBC URL, otherwise False
    use_http_transport=True,  # True if transportMode=http in the JDBC URL, otherwise False
    kerberos_service_name="hive",  # the part of the principal before the /, here hive
    auth_mechanism="GSSAPI",  # leave this as it is
    http_path="cliservice",  # leave this as it is
    krb_host="dwx-env-rhcxab-env.cdp.local",  # the part of the principal after / and before @
)
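
Because the Hive JDBC URL carries the Kerberos service name and host inside the principal property, it can help to split the principal in code rather than by eye. The following is a minimal sketch under the assumption that the principal always has the form service/host@REALM, as in the example above; split_kerberos_principal is a hypothetical helper name, not an impyla function.

```python
def split_kerberos_principal(principal):
    """Hypothetical helper: split 'service/host@REALM' into (service, host, realm)."""
    service, slash, rest = principal.partition("/")
    host, at, realm = rest.partition("@")
    if not (slash and at and service and host and realm):
        raise ValueError(f"expected service/host@REALM, got {principal!r}")
    return service, host, realm


jdbc_url = (
    "jdbc:hive2://hs2-cdw-hive.apps.shared-os-qe-01.kcloud.cloudera.com/default;"
    "transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;retries=3;"
    "kerberosEnableCanonicalHostnameCheck=false;"
    "principal=hive/dwx-env-rhcxab-env.cdp.local@QE-AD-1.CLOUDERA.COM"
)

# Everything after the first ";" is a list of key=value properties.
props = dict(p.split("=", 1) for p in jdbc_url.split(";")[1:] if "=" in p)
service, host, realm = split_kerberos_principal(props["principal"])
```

Here service is the value for kerberos_service_name and host is the value for krb_host; the realm is not passed to connect().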