Transparent Encryption Recommendations for Impala
Consider the following recommendations when configuring HDFS Transparent Encryption for Impala.
Recommendations
- If HDFS encryption is enabled, configure Impala to encrypt data spilled to local disk.
- Limit the rename operations for internal tables once encryption zones are set up. Impala cannot do an ALTER TABLE RENAME operation to move an internal table from one database to another, if the root directories for those databases are in different encryption zones. If the encryption zone covers a table directory but not the parent directory associated with the database, Impala cannot do an ALTER TABLE RENAME operation to rename an internal table, even within the same database.
- Avoid structuring partitioned tables where different partitions reside in different encryption zones, or where any partitions reside in an encryption zone that is different from the root directory for the table. Impala cannot do an INSERT operation into any partition that is not in the same encryption zone as the root directory of the overall table.
- If the data files for a table or partition are in a different encryption zone than the HDFS trashcan, use the PURGE keyword at the end of the DROP TABLE or ALTER TABLE DROP PARTITION statement to delete the HDFS data files immediately. Otherwise, the data files are left behind if they cannot be moved to the trashcan because of differing encryption zones. This syntax is available in Impala 2.3 and higher.
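As an illustration of the rename and PURGE recommendations above, the following statements sketch the syntax involved. The database, table, and partition names (db_a, db_b, sales, sales_staging, sales_by_month, year, month) are hypothetical and do not come from this guide. Whether the rename succeeds depends on how the encryption zones are laid out in your cluster.

```sql
-- Hypothetical names throughout; adjust to your own schema.

-- Moves an internal table between databases. Impala cannot do this if the
-- root directories of db_a and db_b are in different encryption zones.
ALTER TABLE db_a.sales RENAME TO db_b.sales;

-- Deletes the data files immediately instead of moving them to the HDFS
-- trashcan (Impala 2.3 and higher).
DROP TABLE db_a.sales_staging PURGE;

-- Same idea for a single partition of a partitioned table.
ALTER TABLE db_a.sales_by_month DROP PARTITION (year = 2015, month = 12) PURGE;
```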
Steps
Start every impalad process with the
--disk_spill_encryption=true flag set. This
encrypts all spilled data using AES-256-CFB. Set this flag by
selecting the Disk Spill Encryption checkbox in
the Impala configuration.
KMS ACL Configuration for Impala
Cloudera recommends making the impala user a member of the
hive group, and following the ACL recommendations in KMS ACL
Configuration for Hive.
