Configuring Knox With a Secured Hadoop Cluster
Once you have a Hadoop cluster that uses Kerberos for authentication, you must configure Knox to work with that cluster.
To enable the Knox Gateway to interact with a Kerberos-protected Hadoop cluster, add a knox user and Knox Gateway properties to the cluster.
Do the following:
Find the fully-qualified domain name of the host running the gateway:
hostname -f
If the Knox host does not have a static IP address, you can set the Knox host value ($knox-host in the properties below) to * for local developer testing.
At every Hadoop Master:
Create a UNIX account for Knox:
useradd -g hadoop knox
Edit core-site.xml to include the following lines (near the end of the file, before the closing </configuration> tag):
<property>
  <name>hadoop.proxyuser.knox.groups</name>
  <value>users</value>
</property>
<property>
  <name>hadoop.proxyuser.knox.hosts</name>
  <value>$knox-host</value>
</property>
where $knox-host is the fully-qualified domain name of the host running the gateway.
Edit webhcat-site.xml to include the following lines (near the end of the file, before the closing </configuration> tag):
<property>
  <name>webhcat.proxyuser.knox.groups</name>
  <value>users</value>
</property>
<property>
  <name>webhcat.proxyuser.knox.hosts</name>
  <value>$knox-host</value>
</property>
where $knox-host is the fully-qualified domain name of the host running the gateway.
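Before restarting services, it can help to confirm that an edited file really carries the values you intended. The following is a minimal sketch, not part of the Hadoop or Knox tooling: site_property is a hypothetical helper name, the example path is an assumption about where your cluster keeps its configuration, and the helper assumes each <name> element is directly followed by its <value>, as in the properties above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Knox): print the <value> that
# directly follows <name>KEY</name> in a Hadoop *-site.xml file.
# Prints nothing if the property is absent.
site_property() {
  file=$1
  key=$2
  # Join all lines so a property spread over several lines still matches,
  # then place each <property> element on its own line.
  tr '\n' ' ' < "$file" |
    sed 's|</property>|</property>\
|g' |
    grep -F "<name>$key</name>" |
    sed -n 's|.*<name>[^<]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p'
}

# Example (the path is an assumption; use your own configuration directory):
# site_property /etc/hadoop/conf/core-site.xml hadoop.proxyuser.knox.hosts
```

An empty result means the property was not found in that file. The output should match what hostname -f printed on the gateway host, or * in a local test setup.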
At the Oozie host, edit oozie-site.xml to include the following lines (near the end of the file):
<property>
  <name>oozie.service.ProxyUserService.proxyuser.knox.groups</name>
  <value>users</value>
</property>
<property>
  <name>oozie.service.ProxyUserService.proxyuser.knox.hosts</name>
  <value>$knox-host</value>
</property>
where $knox-host is the fully-qualified domain name of the host running the gateway.
At each node running HiveServer2, edit hive-site.xml to include the following properties and values:
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.allow.user.substitution</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.transport.mode</name>
  <value>http</value>
  <description>Server transport mode. "binary" or "http".</description>
</property>
<property>
  <name>hive.server2.thrift.http.port</name>
  <value>10001</value>
  <description>Port number when in HTTP mode.</description>
</property>
<property>
  <name>hive.server2.thrift.http.path</name>
  <value>cliservice</value>
  <description>Path component of URL endpoint when in HTTP mode.</description>
</property>
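The HTTP-mode port and path settings together determine the address that the gateway and other clients use to reach HiveServer2. The sketch below only illustrates how the values combine. The function names and the host name in the usage comment are hypothetical, the defaults mirror the values shown above, and the JDBC form uses the transportMode and httpPath URL parameters, which older Hive releases may not accept. On a Kerberos-protected cluster a client would typically also need a principal parameter, which is left out here.

```shell
#!/bin/sh
# Hypothetical helpers: show how the HTTP-mode settings in hive-site.xml
# combine into the endpoint served by a HiveServer2 node.

# HTTP endpoint: http://<host>:<http.port>/<http.path>
hs2_http_endpoint() {
  host=$1
  port=${2:-10001}        # hive.server2.thrift.http.port
  path=${3:-cliservice}   # hive.server2.thrift.http.path
  printf 'http://%s:%s/%s\n' "$host" "$port" "$path"
}

# JDBC URL for the same endpoint, against the "default" database.
# No Kerberos principal is included; add one for a secured cluster.
hs2_http_jdbc_url() {
  host=$1
  port=${2:-10001}
  path=${3:-cliservice}
  printf 'jdbc:hive2://%s:%s/default;transportMode=http;httpPath=%s\n' \
    "$host" "$port" "$path"
}

# Example (hive.example.com is a placeholder host name):
# hs2_http_endpoint hive.example.com
# hs2_http_jdbc_url hive.example.com
```

If you change the port or path in hive-site.xml, pass the new values as the second and third arguments so the printed address matches what HiveServer2 actually serves.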