Configure WebHDFS for Knox (HA)
WebHDFS provides REST API access to HDFS in a cluster. The following properties
for Knox WebHDFS must be enabled in the /etc/hadoop/conf/hdfs-site.xml
configuration file. The example values shown in these properties are from an
installed instance of the Hortonworks Sandbox.
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.namenode.rpc-address</name>
    <value>sandbox.hortonworks.com:8020</value>
</property>
<property>
    <name>dfs.namenode.http-address</name>
    <value>sandbox.hortonworks.com:50070</value>
</property>
<property>
    <name>dfs.namenode.https-address</name>
    <value>sandbox.hortonworks.com:50470</value>
</property>
The values above must be reflected in each topology descriptor file deployed to the
gateway. By default, the gateway includes a sample topology descriptor file located
at {GATEWAY_HOME}/deployments/sandbox.xml. The values in the following sample are
also configured to work with an installed Hortonworks Sandbox VM.
<service>
    <role>NAMENODE</role>
    <url>hdfs://localhost:8020</url>
</service>
<service>
    <role>WEBHDFS</role>
    <url>http://localhost:50070/webhdfs</url>
</service>
The URL provided for the NAMENODE role does not result in an endpoint being exposed by the gateway. It is required only so that other URLs that reference the NameNode's RPC address can be rewritten, which means clients do not need to be aware of internal cluster details.
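As an illustration of how the hdfs-site.xml values map onto a topology descriptor, the following sketch places the two service entries inside a <topology> root element and uses the Sandbox host name in place of localhost. It assumes the gateway runs on a different host than the NameNode. The <gateway> provider configuration is left out of this sketch, and the host names and ports are the example Sandbox values, so substitute the ones from your own hdfs-site.xml.

```xml
<topology>
    <!-- <gateway> provider configuration omitted from this sketch -->

    <!-- Host and port taken from dfs.namenode.rpc-address;
         used for URL rewriting only, not exposed as an endpoint -->
    <service>
        <role>NAMENODE</role>
        <url>hdfs://sandbox.hortonworks.com:8020</url>
    </service>

    <!-- Host and port taken from dfs.namenode.http-address,
         followed by the /webhdfs path -->
    <service>
        <role>WEBHDFS</role>
        <url>http://sandbox.hortonworks.com:50070/webhdfs</url>
    </service>
</topology>
```

If the gateway and the NameNode run on the same host, as in the Sandbox VM, the localhost URLs shown in the sample above are equivalent.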