Transparent Encryption Recommendations for Impala
There are various recommendations to consider when configuring HDFS Transparent Encryption for Impala.
Recommendations
-
If HDFS encryption is enabled, configure Impala to encrypt data spilled to local disk.
-
Limit the rename operations for internal tables once encryption zones are set up. Impala cannot do an
ALTER TABLE RENAME
operation to move an internal table from one database to another, if the root directories for those databases are in different encryption zones. If the encryption zone covers a table directory but not the parent directory associated with the database, Impala cannot do anALTER TABLE RENAME
operation to rename an internal table, even within the same database. -
Avoid structuring partitioned tables where different partitions reside in different encryption zones, or where any partitions reside in an encryption zone that is different from the root directory for the table. Impala cannot do an
INSERT
operation into any partition that is not in the same encryption zone as the root directory of the overall table. -
If the data files for a table or partition are in a different encryption zone than the HDFS trashcan, use the
PURGE
keyword at the end of theDROP TABLE
orALTER TABLE DROP PARTITION
statement to delete the HDFS data files immediately. Otherwise, the data files are left behind if they cannot be moved to the trashcan because of differing encryption zones. This syntax is available in Impala 2.3 and higher.
Steps
Start every impalad
process with the
--disk_spill_encryption=true
flag set. This
encrypts all spilled data using AES-256-CFB. Set this flag by
selecting the Disk Spill Encryption checkbox in
the Impala configuration ( ).
KMS ACL Configuration for Impala
Cloudera recommends making the impala
user a member of the
hive
group, and following the ACL recommendations in KMS ACL
Configuration for Hive.