Planning for Apache ImpalaPDF version

User Account Requirements

Impala creates and uses a user and group named impala.

Do not delete this account or group and do not modify the account's or group's permissions and rights. Ensure no existing systems obstruct the functioning of these accounts and groups. For example, if you have scripts that delete user accounts not in a white-list, add these accounts to the list of permitted accounts.

For correct file deletion during DROP TABLE operations, Impala must be able to move files to the HDFS trashcan. You might need to create an HDFS directory /user/impala, writeable by the impala user, so that the trashcan can be created. Otherwise, data files might remain behind after a DROP TABLE statement.

Impala should not run as root. Best Impala performance is achieved using direct reads, but root is not permitted to use direct reads. Therefore, running Impala as root negatively affects performance.

By default, any user can connect to Impala and access all the associated databases and tables. You can enable authorization and authentication based on the Linux OS user who connects to the Impala server, and the associated groups for that user. These security features do not change the underlying file permission requirements. The impala user still needs to be able to access the data files.