4. Configuring Knox With a Secured Hadoop Cluster

Once you have a Hadoop cluster that uses Kerberos for authentication, you must configure Knox to work with that cluster.

To enable the Knox Gateway to interact with a Kerberos-protected Hadoop cluster, add a knox user and Knox Gateway properties to the cluster.

Do the following:

  1. Find the fully-qualified domain name of the host running the gateway:

    hostname -f

    If the Knox host does not have a static IP address, you can specify the Knox host as * (any host) for local developer testing.

  2. At every Hadoop Master:

    • Create a UNIX account for Knox:

      useradd -g hadoop knox

    • Edit core-site.xml to include the following lines (near the end of the file):

      <property>
       <name>hadoop.proxyuser.knox.groups</name>
       <value>users</value>
      </property>
      
      <property>
       <name>hadoop.proxyuser.knox.hosts</name>
       <value>$knox-host</value>
      </property>

      where $knox-host is the fully-qualified domain name of the host running the gateway.

    • Edit webhcat-site.xml to include the following lines (near the end of the file):

      <property>
       <name>webhcat.proxyuser.knox.groups</name>
       <value>users</value>
      </property>
      
      <property>
       <name>webhcat.proxyuser.knox.hosts</name>
       <value>$knox-host</value>
      </property>

      where $knox-host is the fully-qualified domain name of the host running the gateway.

  3. At the Oozie host, edit oozie-site.xml to include the following lines (near the end of the file):

    <property>
     <name>oozie.service.ProxyUserService.proxyuser.knox.groups</name>
     <value>users</value>
    </property>
    
    <property>
     <name>oozie.service.ProxyUserService.proxyuser.knox.hosts</name>
     <value>$knox-host</value>
    </property>

    where $knox-host is the fully-qualified domain name of the host running the gateway.

  4. At each node running HiveServer2, edit hive-site.xml to include the following properties and values:

    <property>
     <name>hive.server2.enable.doAs</name>
     <value>true</value>
    </property>
    
    <property>
     <name>hive.server2.allow.user.substitution</name>
     <value>true</value>
    </property>
    
    <property>
     <name>hive.server2.transport.mode</name>
     <value>http</value>
     <description>Server transport mode. "binary" or "http".</description>
    </property>
    
    <property>
     <name>hive.server2.thrift.http.port</name>
     <value>10001</value>
     <description>Port number when in HTTP mode.</description>
    </property>
    
    <property>
     <name>hive.server2.thrift.http.path</name>
     <value>cliservice</value>
     <description>Path component of URL endpoint when in HTTP mode.</description>
    </property>
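
For step 1, a quick sanity check can confirm that the name you plan to use for $knox-host is fully qualified. The following is a minimal sketch and not part of Knox: the is_fqdn function and the KNOX_HOST variable are illustrative names, and the check only looks for a dot in the name.

```shell
# Illustrative helper (not a Knox command): succeeds if the name contains a dot.
is_fqdn() {
  case "$1" in
    *.*) return 0 ;;
    *)   return 1 ;;
  esac
}

# KNOX_HOST is a hypothetical variable name. Fall back to the short hostname
# if "hostname -f" is unavailable on this system.
KNOX_HOST="$(hostname -f 2>/dev/null || hostname 2>/dev/null || echo unknown)"

if is_fqdn "$KNOX_HOST"; then
  echo "Knox host: $KNOX_HOST"
else
  echo "Warning: '$KNOX_HOST' does not look fully qualified" >&2
fi
```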

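Steps 2 and 3 add the same pair of properties under different prefixes. As a sketch, the fragment can be generated from the prefix and the gateway host rather than typed by hand. The proxyuser_props function name and the gateway.example.com host are assumptions for illustration; the output still has to be pasted into the relevant file.

```shell
# Illustrative helper: print the groups/hosts property pair for one prefix.
#   $1 = property prefix, for example hadoop.proxyuser.knox
#   $2 = fully-qualified domain name of the gateway host ($knox-host)
proxyuser_props() {
  cat <<EOF
<property>
 <name>${1}.groups</name>
 <value>users</value>
</property>

<property>
 <name>${1}.hosts</name>
 <value>${2}</value>
</property>
EOF
}

# gateway.example.com is a placeholder host name.
proxyuser_props hadoop.proxyuser.knox gateway.example.com
proxyuser_props oozie.service.ProxyUserService.proxyuser.knox gateway.example.com
```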
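
After editing the files, it can help to confirm that each property carries the value you expect. The sketch below reads a value with awk, assuming the layout used in the listings above, with <name> and <value> on separate lines. The get_prop name is illustrative, and the path in the commented call is an example only.

```shell
# Illustrative helper: print the value of one property from a Hadoop-style
# XML configuration file.
#   $1 = path to the file, $2 = property name
# Assumes <name> and <value> sit on separate lines.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example call (the path is an assumption; use your own configuration directory):
#   get_prop /etc/hive/conf/hive-site.xml hive.server2.transport.mode
```

An empty result means the property was not found in the file, or that its value is not on a separate line from its name.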