Setting up Apache Hive Connector with Kerberos Authentication

Learn how to configure the Apache Hive connector in Cloudera Octopai Client using Kerberos authentication.

Before configuring the Apache Hive connector in Cloudera Octopai, ensure the following components are available and properly configured:

  • MIT Kerberos for Windows: Download and install it from the official MIT Kerberos download page. The default installation path is C:\Program Files\MIT\Kerberos\. To verify the installation, confirm that the following executable exists: C:\Program Files\MIT\Kerberos\bin\kinit.exe. The kerberos.settings.json file used by the Cloudera Octopai Client points to this path by default.
  • Kerberos Configuration File (krb5.ini): Obtain this file from the Hadoop or Hive cluster administrator and place it at C:\ProgramData\MIT\Kerberos5\krb5.ini. The file must include the following sections, with the values adjusted to match your cluster environment:
    [libdefaults]
    default_realm = ROOT.COMOPS.SITE
    dns_lookup_realm = false
    dns_lookup_kdc = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true

    [realms]
    ROOT.COMOPS.SITE = {
        kdc = ccycloud-1.cdp.root.comops.site
        admin_server = ccycloud-1.cdp.root.comops.site
    }

    [domain_realm]
    .root.comops.site = ROOT.COMOPS.SITE
    root.comops.site = ROOT.COMOPS.SITE
  • Kerberos Keytab File: Obtain the keytab file from the Hadoop or Hive cluster administrator. The keytab contains encrypted credentials used for Kerberos authentication, which allows the client to authenticate without prompting for a password. The file is binary and has a .keytab extension. Store the keytab file securely in a location that the Cloudera Octopai Client can access, for example, C:\Octopai\keytabs\hive.keytab.
  • Hive ODBC Driver: Download and install a Hive ODBC driver that supports Kerberos authentication. Driver options include the Cloudera or Hortonworks Hive ODBC driver, or a vendor-specific equivalent. Ensure the driver architecture (32-bit or 64-bit) matches the Cloudera Octopai Client installation.
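The file-based prerequisites above can be checked with a short script before you continue. The following Python sketch is not part of the Cloudera Octopai Client. It assumes the default paths named above and the example keytab location, and the function names missing_sections and check_prerequisites are hypothetical. It only confirms that the files exist and that krb5.ini contains the three section headers; it does not validate the values inside them.

```python
from pathlib import Path

# Paths taken from the prerequisites above; adjust them if your installation differs.
KINIT_PATH = Path(r"C:\Program Files\MIT\Kerberos\bin\kinit.exe")
KRB5_INI_PATH = Path(r"C:\ProgramData\MIT\Kerberos5\krb5.ini")
KEYTAB_PATH = Path(r"C:\Octopai\keytabs\hive.keytab")  # example location only

REQUIRED_SECTIONS = ("libdefaults", "realms", "domain_realm")


def missing_sections(krb5_text):
    """Return the required section names that do not appear as [section] headers."""
    found = set()
    for line in krb5_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            found.add(stripped[1:-1].strip())
    return [name for name in REQUIRED_SECTIONS if name not in found]


def check_prerequisites():
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    checks = (
        ("kinit.exe", KINIT_PATH),
        ("krb5.ini", KRB5_INI_PATH),
        ("keytab", KEYTAB_PATH),
    )
    for label, path in checks:
        if not path.is_file():
            problems.append(f"{label} not found at {path}")
    if KRB5_INI_PATH.is_file():
        text = KRB5_INI_PATH.read_text(encoding="utf-8", errors="replace")
        for name in missing_sections(text):
            problems.append(f"krb5.ini has no [{name}] section")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Run the script on the machine that hosts the Cloudera Octopai Client. If it prints nothing, all three files were found and krb5.ini contains the expected section headers.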
  1. Install and configure the Hive ODBC driver.
    1. Open the ODBC Data Source Administrator:
      • Search for "ODBC Data Source" in the Windows Start menu.
      • Select the 32-bit or 64-bit version to match the architecture of the installed driver.
    2. Create a new system DSN:
      • Navigate to the System DSN tab and click Add.
      • Select the Hive ODBC driver (for example, Cloudera ODBC Driver for Hive) and click Finish.
    3. Configure the DSN basic settings:
      • Data Source Name: Provide a user-friendly name (for example, Hive_Kerberos_Prod).
      • Description: (Optional) Add details (for example, "Production Hive with Kerberos authentication").
      • Host: Enter the hostname or IP address of the HiveServer2 service (for example, ccycloud-1.cdp.root.comops.site).
      • Port: The default HiveServer2 port is 10000. Confirm the correct value in the cluster configuration.
      • Database: Specify a default database to connect to (for example, default).
    4. Configure the Kerberos authentication parameters:
      • Authentication Mechanism: Kerberos
      • Service Name: hive
      • Realm: Kerberos realm (for example, ROOT.COMOPS.SITE)
      • Host FQDN: Fully qualified domain name of the HiveServer2 service host
      • Kerberos Configuration Path: C:\ProgramData\MIT\Kerberos5\krb5.ini
    5. Configure SSL or TLS (Optional):
      If the cluster requires secure connectivity, enable SSL and configure the truststore path and password according to the security policy of the cluster.
    6. Configure advanced settings if required by the environment:
      • Connection timeout values
      • Thrift transport, typically SASL for Kerberos
      • Native query execution
    7. Test and save the DSN:
      • Test the connection by clicking Test. A valid Kerberos ticket may be required for the test to succeed.
      • If the test is successful, click OK to save the DSN.
  2. Configure Cloudera Octopai Client for Hive with Kerberos.
    1. Add a Hive connection to Cloudera Octopai Client:
      Launch the Cloudera Octopai Client and click Add New Connection or use the connection wizard.
    2. Configure the Hive connection parameters:
      • Connection Name: Provide a descriptive name (for example, Production Hive Kerberos)
      • Authentication settings:
        • Authentication Type: Kerberos (Kerberos-specific fields will be displayed)
        • Kerberos Principal: for example, hive@ROOT.COMOPS.SITE
        • Keytab Path: Full path to the keytab file (for example, C:\Octopai\keytabs\hive.keytab)
      • If using ODBC, specify the name of the DSN created earlier (for example, Hive_Kerberos_Prod)
  3. Test the connection.
    • Click Test Connection.
    • During testing, the Cloudera Octopai Client does the following:
      • Acquires a Kerberos ticket using kinit and the provided keytab
      • Attempts to connect to Hive
      • Displays the connection status
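To check the ticket acquisition step on its own, you can run kinit against the keytab yourself. The Python sketch below shows one way to do this. It is an illustration under stated assumptions, not the code the Cloudera Octopai Client runs internally. It assumes the default kinit.exe path and uses the standard MIT kinit options -k (use a keytab) and -t (keytab file). The function names build_kinit_command and acquire_ticket are hypothetical.

```python
import subprocess

# Default MIT Kerberos for Windows location; adjust if installed elsewhere.
KINIT_PATH = r"C:\Program Files\MIT\Kerberos\bin\kinit.exe"


def build_kinit_command(principal, keytab_path, kinit_path=KINIT_PATH):
    """Build the argument list for a keytab-based (non-interactive) kinit call."""
    # -k requests a ticket using a keytab; -t names the keytab file to use.
    return [kinit_path, "-k", "-t", keytab_path, principal]


def acquire_ticket(principal, keytab_path, kinit_path=KINIT_PATH):
    """Run kinit and return (succeeded, combined output text)."""
    result = subprocess.run(
        build_kinit_command(principal, keytab_path, kinit_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, (result.stdout + result.stderr).strip()
```

For example, acquire_ticket("hive@ROOT.COMOPS.SITE", r"C:\Octopai\keytabs\hive.keytab") returns a success flag together with any message that kinit printed. If this call fails outside the Cloudera Octopai Client, the problem is likely in the Kerberos setup rather than in the DSN or the connection parameters.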

If the test fails, check the error message and verify the following:

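  • Kerberos configuration: the krb5.ini location, the realm, the KDC host, the principal, and the keytab path
  • DSN settings: the host, port, realm, and Host FQDN values
  • Service availability: the HiveServer2 service is running and reachable on the configured host and port
  • File permissions: the Cloudera Octopai Client can read the keytab and krb5.ini files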
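Two of these checks, service availability and file permissions, can be approximated with a few lines of Python. This is a minimal sketch. The function names is_port_open and is_readable_file are hypothetical, and the host, port, and paths you pass in are the example values from this page unless your environment differs. A successful TCP connection shows only that something is listening on the port; it does not prove that Kerberos authentication will succeed.

```python
import os
import socket


def is_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def is_readable_file(path):
    """Return True if path is an existing file that the current user can read."""
    return os.path.isfile(path) and os.access(path, os.R_OK)
```

For example, call is_port_open("ccycloud-1.cdp.root.comops.site", 10000) for the HiveServer2 endpoint and is_readable_file(r"C:\Octopai\keytabs\hive.keytab") for the keytab. Run the checks under the same Windows account that runs the Cloudera Octopai Client, because file permissions are evaluated for the current user.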