Setting up Apache Impala Connector with Kerberos Authentication

Learn how to configure the Apache Impala connector in Cloudera Octopai Client using Kerberos authentication.

Before configuring the Apache Impala connector in Cloudera Octopai, ensure the following components are available and properly configured:

  • MIT Kerberos for Windows: Download and install from the official MIT Kerberos download page. The default installation path is C:\Program Files\MIT\Kerberos\. To verify the installation, ensure that the following executable file exists in your environment: C:\Program Files\MIT\Kerberos\bin\kinit.exe. This path is configured by default in the kerberos.settings.json file used by the Cloudera Octopai Client.
  • Kerberos Configuration File (krb5.ini): Obtain this file from the Hadoop or Impala cluster administrator and place it under C:\ProgramData\MIT\Kerberos5\krb5.ini. The configuration must include the following sections and values adjusted to the actual cluster environment:
    [libdefaults]
      default_realm = ROOT.COMOPS.SITE
      dns_lookup_realm = false
      dns_lookup_kdc = false
      ticket_lifetime = 24h
      renew_lifetime = 7d
      forwardable = true

    [realms]
      ROOT.COMOPS.SITE = {
        kdc = ccycloud-1.cdp.root.comops.site
        admin_server = ccycloud-1.cdp.root.comops.site
      }

    [domain_realm]
      .root.comops.site = ROOT.COMOPS.SITE
      root.comops.site = ROOT.COMOPS.SITE
  • Kerberos Keytab File: Obtain the keytab file from the Hadoop or Impala cluster administrator. The keytab contains encrypted credentials and enables non-interactive Kerberos authentication. It is a binary file with the .keytab extension. Store the keytab file securely in a location that the Cloudera Octopai Client can read, for example, C:\Octopai\keytabs\impala.keytab.
  • Impala ODBC Driver: Download and install an Impala ODBC driver that supports Kerberos authentication, such as the Cloudera ODBC Driver for Impala or a vendor-specific equivalent. Ensure the driver architecture (32-bit or 64-bit) matches the Cloudera Octopai Client installation.
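
The file-based prerequisites above can be checked with a short script before starting the configuration. The following is a minimal sketch, not part of the Cloudera Octopai Client: the function names (missing_sections, check_prerequisites) are hypothetical, the paths are the defaults and the example keytab location given above, and the krb5.ini check only looks for the three section headers shown in the sample configuration.

```python
from pathlib import Path

# Default locations from the prerequisites above. The keytab path is only the
# document's example and should be adjusted to the actual environment.
KINIT_EXE = Path(r"C:\Program Files\MIT\Kerberos\bin\kinit.exe")
KRB5_INI = Path(r"C:\ProgramData\MIT\Kerberos5\krb5.ini")
KEYTAB = Path(r"C:\Octopai\keytabs\impala.keytab")

# Section headers that appear in the sample krb5.ini.
REQUIRED_SECTIONS = ("[libdefaults]", "[realms]", "[domain_realm]")


def missing_sections(krb5_text: str) -> list[str]:
    """Return the required section headers that do not appear in the text."""
    present = {line.strip() for line in krb5_text.splitlines()}
    return [s for s in REQUIRED_SECTIONS if s not in present]


def check_prerequisites(kinit=KINIT_EXE, krb5_ini=KRB5_INI, keytab=KEYTAB) -> list[str]:
    """Collect readable problem descriptions; an empty list means all checks passed."""
    problems = []
    for label, path in (("kinit.exe", kinit), ("krb5.ini", krb5_ini), ("keytab", keytab)):
        if not Path(path).is_file():
            problems.append(f"{label} not found: {path}")
    if Path(krb5_ini).is_file():
        text = Path(krb5_ini).read_text(encoding="utf-8", errors="replace")
        for section in missing_sections(text):
            problems.append(f"krb5.ini is missing section {section}")
    return problems


if __name__ == "__main__":
    for message in check_prerequisites() or ["All prerequisite files were found."]:
        print(message)
```

The script does not validate the ODBC driver installation or the realm values themselves; those still need to be confirmed with the cluster administrator.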
  1. Install and configure the Impala ODBC driver.
    1. Open the ODBC Data Source Administrator:
      • Search for "ODBC Data Source" in the Windows Start menu.
      • Select the 32-bit or 64-bit version of the administrator to match the architecture of the installed driver.
    2. Create a new system DSN:
      • Navigate to the System DSN tab and click Add.
      • Select the Impala ODBC driver (for example, Cloudera ODBC Driver for Impala) and click Finish.
    3. Configure the DSN basic settings:
      • Data Source Name: Provide a user-friendly name (for example, Impala_Kerberos_Prod).
      • Description: (Optional) Add details (for example, "Production Impala with Kerberos authentication").
      • Host: Enter the hostname or IP address of the Impala service (for example, ccycloud-1.cdp.root.comops.site).
      • Port: Enter the Impala service port. The default is 21050; confirm the correct value in the cluster configuration.
      • Database: Specify a default database to connect to (for example, default).
    4. Configure the Kerberos authentication parameters:
      • Authentication Mechanism: Kerberos
      • Service Name: impala
      • Realm: Kerberos realm (for example, ROOT.COMOPS.SITE)
      • Host FQDN: Fully qualified domain name of the Impala service host
      • Kerberos Configuration Path: C:\ProgramData\MIT\Kerberos5\krb5.ini
    5. Configure SSL or TLS (Optional):
      If the cluster requires secure connectivity, enable SSL, and configure the truststore path and password according to the security policy of the cluster.
    6. Configure advanced settings if required by the environment:
      • Connection timeout values
      • Thrift transport, typically SASL for Kerberos
      • Native query execution
    7. Test and save the DSN:
      • Test the connection by clicking Test. A valid Kerberos ticket may be required for the test to succeed.
      • If the test is successful, click OK to save the DSN.
  2. Configure Cloudera Octopai Client for Impala with Kerberos.
    1. Add an Impala connection to Cloudera Octopai Client:
      Launch the Cloudera Octopai Client and click Add New Connection or use the connection wizard.
    2. Configure the Impala connection parameters:
      • Connection Name: Provide a descriptive name (for example, Production Impala Kerberos)
      • Authentication settings:
        • Authentication Type: Kerberos (Kerberos-specific fields will be displayed)
        • Kerberos Principal: for example, impala@ROOT.COMOPS.SITE
        • Keytab Path: Full path to the keytab file (for example, C:\Octopai\keytabs\impala.keytab)
      • DSN: If connecting through ODBC, specify the name of the DSN created in step 1 (for example, Impala_Kerberos_Prod)
  3. Test the connection.
    • Click Test Connection.
    • During testing, the Cloudera Octopai Client performs the following steps:
      • Acquire a Kerberos ticket using kinit and the provided keytab
      • Attempt to connect to Impala
      • Display the connection status
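
To reproduce the test sequence outside the client, the same steps can be approximated in a script. The following is a sketch under stated assumptions, not the client's actual implementation: it assumes kinit is invoked with the standard MIT Kerberos options -k and -t to read the key from a keytab, it uses the third-party pyodbc package to open the DSN, and the function names (build_kinit_command, acquire_ticket, check_dsn) are hypothetical.

```python
import subprocess

# Default kinit location from the prerequisites; adjust if installed elsewhere.
KINIT_EXE = r"C:\Program Files\MIT\Kerberos\bin\kinit.exe"


def build_kinit_command(principal: str, keytab: str, kinit: str = KINIT_EXE) -> list[str]:
    """Build 'kinit -k -t <keytab> <principal>' (keytab login, no password prompt)."""
    return [kinit, "-k", "-t", keytab, principal]


def acquire_ticket(principal: str, keytab: str, kinit: str = KINIT_EXE) -> None:
    """Run kinit; raises subprocess.CalledProcessError if kinit exits with an error."""
    subprocess.run(
        build_kinit_command(principal, keytab, kinit),
        check=True,
        capture_output=True,
        text=True,
    )


def check_dsn(dsn: str) -> bool:
    """Open the DSN and run a trivial query; returns True if a row comes back."""
    import pyodbc  # third-party package, assumed to be installed separately

    # autocommit=True because Impala does not support transactions.
    conn = pyodbc.connect(f"DSN={dsn}", autocommit=True)
    try:
        row = conn.cursor().execute("SELECT 1").fetchone()
        return row is not None
    finally:
        conn.close()


if __name__ == "__main__":
    # Example values taken from this guide; replace them with real ones.
    acquire_ticket("impala@ROOT.COMOPS.SITE", r"C:\Octopai\keytabs\impala.keytab")
    print("Connection OK" if check_dsn("Impala_Kerberos_Prod") else "No result returned")
```

Running the two steps separately helps tell a Kerberos problem (kinit fails) apart from a driver or DSN problem (the ticket is acquired but the connection fails).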

If the test fails, check the error message and verify the following:

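  • Kerberos configuration: the realm and KDC entries in krb5.ini, the principal, and the keytab path
  • DSN settings: the host, port, and Kerberos authentication parameters
  • Service availability: the Impala service is running and reachable on the configured host and port
  • File permissions: the account running the Cloudera Octopai Client can read krb5.ini and the keytab file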
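
Two of these checks, service availability and file permissions, are easy to script. The following is a minimal sketch using only the Python standard library; the function names (port_reachable, file_readable) are hypothetical, and the host, port, and paths in the usage section are the example values from this guide. A successful TCP connection only shows that something is listening on the port, not that Kerberos authentication will succeed.

```python
import os
import socket


def port_reachable(host: str, port: int = 21050, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def file_readable(path: str) -> bool:
    """Return True if the path is an existing file that the current user can read."""
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    # Example values from this guide; replace them with the real environment.
    print("Impala port reachable:", port_reachable("ccycloud-1.cdp.root.comops.site", 21050))
    print("krb5.ini readable:", file_readable(r"C:\ProgramData\MIT\Kerberos5\krb5.ini"))
    print("keytab readable:", file_readable(r"C:\Octopai\keytabs\impala.keytab"))
```

Run the script under the same Windows account that runs the Cloudera Octopai Client, since file permissions are evaluated for the current user.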