The goal of encryption is to ensure that only authorized users can view, use, or contribute to a data set. These security controls add another layer of protection against potential threats from malicious end users, administrators, and other actors on the network. Data protection can be applied at a number of levels within Hadoop:
  • OS Filesystem-level - Encryption can be applied at the Linux filesystem level to cover all files in a volume. An example of this approach is Cloudera Navigator Encrypt (formerly Gazzang zNcrypt), which is available to Cloudera customers licensed for Cloudera Navigator. Navigator Encrypt operates at the Linux volume level, so it can encrypt cluster data both inside and outside HDFS, such as temp/spill files, configuration files, and metadata databases (it is to be used only for data related to a CDH cluster). Navigator Encrypt must be used with Cloudera Navigator Key Trustee Server (formerly Gazzang zTrustee).
  • HDFS-level - Encryption applied by the HDFS client software. HDFS Data At Rest Encryption operates at the HDFS folder level, allowing you to encrypt some folders and leave others unencrypted; it cannot encrypt any data outside HDFS. To ensure reliable key storage (so that data is not lost), use Cloudera Navigator Key Trustee Server; the default Java keystore is suitable only for testing. See Integrating HDFS Encryption with Navigator Key Trustee Server for more information.
  • Network-level - Encryption can be applied to data just before it is sent across the network, with decryption just after receipt. In Hadoop, this covers data sent from client user interfaces as well as service-to-service communication such as remote procedure calls (RPCs). This protection uses industry-standard protocols such as TLS/SSL.
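
The HDFS-level approach above is driven through the standard Hadoop CLI. The following is a minimal sketch, assuming a key provider (KMS) is already configured for the cluster; the key name mykey and the path /secure are placeholder examples, not required names:

```shell
# Create an encryption key in the configured key provider (Navigator Key
# Trustee Server in production; the default Java keystore for testing only).
hadoop key create mykey

# Create an encryption zone: an empty HDFS directory whose contents will be
# transparently encrypted by the HDFS client using keys derived from "mykey".
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName mykey -path /secure

# Verify: list the encryption zones currently defined in the cluster.
hdfs crypto -listZones
```

Because encryption and decryption happen in the HDFS client, files written under /secure are encrypted before they ever reach the DataNodes, while folders outside the zone remain unencrypted.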
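
For the network level, a common starting point in stock Apache Hadoop is to enable RPC privacy and block-data-transfer encryption via configuration. This fragment is illustrative only; distributions such as CDH typically expose these settings through their management console rather than hand-edited files:

```xml
<!-- core-site.xml: "privacy" adds encryption on top of
     authentication and integrity for Hadoop RPC traffic. -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>

<!-- hdfs-site.xml: encrypt the DataNode block data transfer protocol. -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>

<!-- hdfs-site.xml: serve HDFS web UIs over TLS/SSL only. -->
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```

RPC protection and data-transfer encryption cover service-to-service traffic, while the HTTPS policy (backed by a TLS keystore configured separately) covers browser-facing interfaces.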