Configuring a CDW-Kudu connection

To make the connection, you set an Impala coordinator property to the fully qualified domain names (FQDNs) of the Kudu master hosts.

  • You must meet the prerequisites listed in "Connecting CDW and Kudu".
  • You must have gathered the FQDNs of the Kudu master hosts as described in the previous topic, "Gathering Kudu master FQDNs".
  1. From the Management Console or CDP landing page, navigate to Data Warehouse.
  2. In Virtual Warehouses, select your Impala Virtual Warehouse, and click > Edit.
  3. On the Virtual Warehouse Details page, click Configurations > Impala Coordinator.
  4. Select flagfile from the Configuration files drop-down list, and set the kudu_master_hosts key to a comma-separated list of all Kudu master FQDNs, each followed by its port.
    New Virtual Warehouse example:
    kudu_master_hosts=kudu-master1.sandbox.a.cloudera.site:7051,kudu-master2.sandbox.a.cloudera.site:7051,kudu-master3.sandbox.a.cloudera.site:7051
    This screenshot shows the three FQDNs, oversimplified for readability.
    Existing Virtual Warehouse:
    • Click Add Custom Configurations, and set kudu_master_hosts to the FQDN of a single master host as follows:
      • Configuration Key: kudu_master_hosts
      • Configuration Value: host1:7051
    • Select flagfile from the Configuration files drop-down list, and add the FQDNs of the other Kudu master hosts to the value of kudu_master_hosts, separated by commas.
  5. Click Apply Changes.
    When the Virtual Warehouse finishes updating, you can query Kudu tables from Hue, an Impala shell, or an ODBC/JDBC client. For example:
    --- First, create an example table
    CREATE TABLE my_first_kudu_table
    (
      id BIGINT,
      name STRING,
      PRIMARY KEY(id)
    )
    PARTITION BY HASH PARTITIONS 16
    STORED AS KUDU;
    
    --- Next, insert some example data
    INSERT INTO my_first_kudu_table VALUES (101,'A'),(102,'B'),(103,'C');
    
    --- Finally, select from the example table to show the above has worked
    SELECT * FROM my_first_kudu_table;
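
    The kudu_master_hosts value set in step 4 is easy to mistype when there are several masters. The following is a minimal sketch, not part of the product, of assembling that value from the FQDNs you gathered. The function name build_kudu_master_hosts is hypothetical, and the sketch assumes that every master listens on the Kudu master RPC port 7051 used in the examples above unless the FQDN already carries a port.

```python
# Hypothetical helper: not a Cloudera or Kudu API, only string assembly.
DEFAULT_KUDU_MASTER_RPC_PORT = 7051  # assumption: same port as in the examples above


def build_kudu_master_hosts(fqdns, port=DEFAULT_KUDU_MASTER_RPC_PORT):
    """Return a flagfile line of the form kudu_master_hosts=host1:port,host2:port."""
    entries = []
    for fqdn in fqdns:
        host = fqdn.strip()
        if not host:
            raise ValueError("empty Kudu master FQDN")
        # Keep an explicit port if one is present; otherwise append the default.
        entries.append(host if ":" in host else f"{host}:{port}")
    if not entries:
        raise ValueError("at least one Kudu master FQDN is required")
    return "kudu_master_hosts=" + ",".join(entries)


if __name__ == "__main__":
    masters = [
        "kudu-master1.sandbox.a.cloudera.site",
        "kudu-master2.sandbox.a.cloudera.site",
        "kudu-master3.sandbox.a.cloudera.site",
    ]
    print(build_kudu_master_hosts(masters))
```

    Run with the three example hosts, the script prints the same line as the New Virtual Warehouse example in step 4, which you can paste into the flagfile.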