Configuring Proxy Users to Access HDFS
Hadoop allows you to configure proxy users to submit jobs or access HDFS on behalf of
other users; this is called impersonation. When you enable impersonation, any jobs
submitted using a proxy are executed with the impersonated user's existing privilege levels
rather than those of a superuser (such as hdfs).
All proxy users are configured in one location, core-site.xml, so that Hadoop
administrators can implement centralized access control.
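As one illustration of what acting on behalf of another user looks like from the client side, the WebHDFS REST API lets a proxy user name the impersonated user in the doas query parameter. The Python sketch below only builds such a URL and sends nothing. The function name, the NameNode address namenode.example.com:9870, and the path are hypothetical, and user.name is assumed to identify the caller on a cluster without Kerberos authentication.

```python
from urllib.parse import urlencode


def webhdfs_impersonation_url(namenode, path, proxy_user, target_user, op="LISTSTATUS"):
    """Build a WebHDFS URL in which proxy_user acts on behalf of target_user."""
    # user.name identifies the calling (proxy) user; doas names the impersonated user.
    query = urlencode({"op": op, "user.name": proxy_user, "doas": target_user})
    return f"http://{namenode}/webhdfs/v1{path}?{query}"


# Hypothetical NameNode address and HDFS path, for illustration only.
url = webhdfs_impersonation_url("namenode.example.com:9870", "/user/bob", "alice", "bob")
print(url)
```

Such a request succeeds only if the proxy user (alice here) is permitted to impersonate the target user from the calling host by the hadoop.proxyuser properties.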
To configure proxy users, set the hadoop.proxyuser.<proxy_user>.hosts,
hadoop.proxyuser.<proxy_user>.groups, and hadoop.proxyuser.<proxy_user>.users
properties in core-site.xml.
For example, to allow user alice to impersonate any user belonging to group_a or
group_b, set hadoop.proxyuser.<proxy_user>.groups as follows:
<property>
  <name>hadoop.proxyuser.alice.groups</name>
  <value>group_a,group_b</value>
</property>
To limit the hosts from which impersonated connections are allowed, use
hadoop.proxyuser.<proxy_user>.hosts. For example, to allow user alice to make
impersonated connections only from host_a and host_b:
<property>
  <name>hadoop.proxyuser.alice.hosts</name>
  <value>host_a,host_b</value>
</property>
If the configuration properties described are not present, impersonation is not allowed and connections will fail.
For looser restrictions, use a wildcard (*) to allow impersonation from any host and
of any user. For example, to allow user bob to impersonate any user belonging to any
group, and from any host, set the properties as follows:
<property>
  <name>hadoop.proxyuser.bob.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.bob.groups</name>
  <value>*</value>
</property>
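When several proxy users need entries, it can help to generate the property elements rather than type them by hand. The following Python sketch renders the elements for one proxy user with the standard library. The helper name proxyuser_properties and its parameters are hypothetical and not part of Hadoop; the output still has to be placed inside the configuration element of core-site.xml by whatever process manages that file.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(proxy_user, hosts=None, groups=None, users=None):
    """Return <property> elements for one proxy user as XML strings."""
    settings = {"hosts": hosts, "groups": groups, "users": users}
    rendered = []
    for suffix, values in settings.items():
        if values is None:
            continue  # properties that were not requested are left out entirely
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = f"hadoop.proxyuser.{proxy_user}.{suffix}"
        # Accept either a ready-made string ("*" or "a,b") or a list of entries.
        value = values if isinstance(values, str) else ",".join(values)
        ET.SubElement(prop, "value").text = value
        rendered.append(ET.tostring(prop, encoding="unicode"))
    return rendered


# Reproduces the wildcard configuration for user bob.
for element in proxyuser_properties("bob", hosts="*", groups="*"):
    print(element)
```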
The hadoop.proxyuser.<proxy_user>.hosts property also accepts comma-separated lists of
IP addresses, IP address ranges in CIDR format, or host names. For example, to allow
user kate access from hosts in the range 10.222.0.0/16 and from 10.113.221.221, to
impersonate user_a and user_b, set the proxy user properties as follows:
<property>
  <name>hadoop.proxyuser.kate.hosts</name>
  <value>10.222.0.0/16,10.113.221.221</value>
</property>
<property>
  <name>hadoop.proxyuser.kate.users</name>
  <value>user_a,user_b</value>
</property>
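To make the combined effect of the hosts, users, and groups properties concrete, the Python sketch below models the decision for a single request. It is a simplified illustration and not Hadoop's implementation: the function names and the dictionary representation of core-site.xml are hypothetical, host-name entries are skipped because the sketch performs no DNS resolution, and it assumes that a target user must match either the users list or the groups list in addition to the host check.

```python
import ipaddress


def host_allowed(client_ip, hosts_value):
    """Check an IP against a comma-separated hosts value (IPs, CIDR ranges, or *)."""
    address = ipaddress.ip_address(client_ip)
    for entry in (e.strip() for e in hosts_value.split(",")):
        if entry == "*":
            return True
        try:
            # ip_network accepts a single address as well as a CIDR range.
            if address in ipaddress.ip_network(entry, strict=False):
                return True
        except ValueError:
            # Not an IP or CIDR entry (for example a host name); skipped here.
            continue
    return False


def name_allowed(names, allowed_value):
    """Check whether any of the names appears in a comma-separated list or matches *."""
    if allowed_value is None:
        return False
    allowed = {entry.strip() for entry in allowed_value.split(",")}
    return "*" in allowed or any(name in allowed for name in names)


def may_impersonate(conf, proxy_user, target_user, target_groups, client_ip):
    """Simplified model: is proxy_user allowed to act as target_user from client_ip?"""
    prefix = f"hadoop.proxyuser.{proxy_user}."
    hosts = conf.get(prefix + "hosts")
    if hosts is None:
        return False  # missing properties mean impersonation is not allowed
    user_ok = name_allowed([target_user], conf.get(prefix + "users"))
    group_ok = name_allowed(target_groups, conf.get(prefix + "groups"))
    return (user_ok or group_ok) and host_allowed(client_ip, hosts)


# The configuration for user kate, represented as a plain dictionary.
conf = {
    "hadoop.proxyuser.kate.hosts": "10.222.0.0/16,10.113.221.221",
    "hadoop.proxyuser.kate.users": "user_a,user_b",
}
print(may_impersonate(conf, "kate", "user_a", [], "10.222.14.7"))
print(may_impersonate(conf, "kate", "user_a", [], "192.168.1.1"))
```

In this model, a request from kate on behalf of user_a passes from any address inside 10.222.0.0/16 or from 10.113.221.221, and is refused from any other address or for any user not listed.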