(Deprecated) Creating a connection to Cloudera Data Warehouse for CDW Operator

Learn how to create an Airflow connection to an existing Cloudera Data Warehouse (CDW) virtual warehouse before you run workloads with the CDW Operator.

The following steps apply to the Airflow service provided with each Cloudera Data Engineering (CDE) virtual cluster. For information about using your own Airflow deployment, see Using Cloudera Data Engineering with an external Apache Airflow deployment.

To determine the Cloudera Data Warehouse hostname to use for the connection, perform the following steps:

  1. In the Cloudera Data Platform (CDP) management console, click the Data Warehouse tile and click Overview.
  2. In the Virtual Warehouses column, locate the Hive or Impala warehouse you want to connect to.
  3. Click the options menu next to the selected warehouse, and then click Copy JDBC URL.
  4. Paste the URL into a text editor, and make note of the hostname.
    For example,
    jdbc:hive2://hs2-aws-2-hive.env-k5ip0r.dw.ylcu-atmi.cloudera.site/default;transportMode=http;httpPath=cliservice;ssl=true;retries=3;
    In this JDBC URL, the hostname is hs2-aws-2-hive.env-k5ip0r.dw.ylcu-atmi.cloudera.site.
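
If you prefer not to pick the hostname out by eye, the extraction in step 4 can be sketched in Python as follows. This is a minimal sketch that uses only the standard library. The helper name hostname_from_jdbc_url is hypothetical and is not part of any Cloudera or Airflow API, and the sketch assumes the copied URL has the jdbc:<scheme>://<host>/... shape shown in the example above.

```python
from urllib.parse import urlparse


def hostname_from_jdbc_url(jdbc_url: str) -> str:
    """Return the hostname part of a JDBC URL (hypothetical helper)."""
    prefix = "jdbc:"
    url = jdbc_url.strip()
    if not url.lower().startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Drop the 'jdbc:' prefix so the remainder parses as scheme://host/path.
    parsed = urlparse(url[len(prefix):])
    if not parsed.hostname:
        raise ValueError("no hostname found in JDBC URL")
    # urlparse lowercases the hostname and strips any port.
    return parsed.hostname


# The example URL from step 4.
jdbc_url = (
    "jdbc:hive2://hs2-aws-2-hive.env-k5ip0r.dw.ylcu-atmi.cloudera.site/default;"
    "transportMode=http;httpPath=cliservice;ssl=true;retries=3;"
)
print(hostname_from_jdbc_url(jdbc_url))
```

The printed value is the string to enter in the Host field of the Airflow connection.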

To create a connection to an existing CDW virtual warehouse using the embedded Airflow UI, perform the following steps:

  1. In the Cloudera Data Platform (CDP) console, click the Data Engineering tile. The CDE Home page displays.
  2. Click Administration in the left navigation menu and select the service containing the virtual cluster that you are using.
  3. In the Virtual Clusters column, click Cluster Details for the virtual cluster.
  4. Click AIRFLOW UI.
  5. In the Airflow UI, click Connections in the Admin menu.
  6. Click the plus sign to add a new record, and fill in the following fields:
    • Conn Id: Create a unique connection identifier. For example, cdw-hive-demo.
    • Conn Type: Select Hive Client Wrapper.
    • Host: Enter the hostname copied from the JDBC connection URL. Do not enter the full JDBC URL.
    • Schema: Enter the schema to be used. The default value is default.
    • Login/Password: Enter your workload username and password.
  7. Click Save.
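
Before saving, it can help to check the values you plan to enter, because pasting the full JDBC URL into the Host field is an easy mistake. The following is a minimal sketch of such a check, not an Airflow or Cloudera API. The function build_connection_fields and its parameter names are hypothetical, and the dictionary keys only mirror the form labels listed in step 6.

```python
def build_connection_fields(conn_id, host, login, password, schema="default"):
    """Collect the step 6 form values and reject a JDBC URL as host (hypothetical helper)."""
    if not conn_id:
        raise ValueError("Conn Id must not be empty")
    # A bare hostname contains no scheme, path separator, or JDBC parameters.
    if host.lower().startswith("jdbc:") or any(ch in host for ch in "/;"):
        raise ValueError("Host must be the hostname only, not the full JDBC URL")
    return {
        "Conn Id": conn_id,
        "Conn Type": "Hive Client Wrapper",
        "Host": host,
        "Schema": schema,  # 'default' unless another schema is needed
        "Login": login,
        "Password": password,
    }


# Placeholder credentials for illustration only.
fields = build_connection_fields(
    conn_id="cdw-hive-demo",
    host="hs2-aws-2-hive.env-k5ip0r.dw.ylcu-atmi.cloudera.site",
    login="workload-user",
    password="workload-password",
)
for label, value in fields.items():
    if label != "Password":  # avoid echoing the password
        print(f"{label}: {value}")
```

The check is deliberately strict: any slash or semicolon in the host value is treated as a sign that part of the JDBC URL was pasted along with the hostname.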