Encryption

The goal of encryption is to ensure that only authorized users can view, use, or contribute to a data set. These security controls add another layer of protection against potential threats from end users, administrators, and other potentially malicious actors on the network. Data protection can be applied at a number of levels within Hadoop:
  • OS Filesystem-level - Encryption can be applied at the Linux operating system file system level to cover all files in a volume. An example of this approach is Cloudera Navigator Encrypt.

    Navigator Encrypt (formerly Gazzang zNcrypt) is production-ready and available now for Cloudera customers licensed for Cloudera Navigator. Navigator Encrypt operates at the Linux volume level, so it can encrypt cluster data both inside and outside HDFS, such as temp/spill files, configuration files, and metadata databases. It should be used only for data related to a CDH cluster. Navigator Encrypt must be used with Navigator Key Trustee (formerly Gazzang zTrustee).

  • HDFS-level - Encryption is applied by the HDFS client software. An example of this approach is the HDFS Data At Rest Encryption feature.

    HDFS Data At Rest Encryption (not production-ready in CDH 5.2) operates at the HDFS folder level, so encryption can be applied only to the HDFS folders where it is needed. It cannot encrypt any data outside HDFS. To ensure reliable key storage (so that data is not lost), Navigator Key Trustee should be used; the default Java keystore is suitable only for test purposes.

  • Network-level - Data can be encrypted just before it is sent across a network and decrypted as soon as it is received. In Hadoop, this covers data sent from client user interfaces as well as service-to-service communication such as remote procedure calls (RPCs). This protection uses industry-standard protocols such as SSL/TLS.
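
The network-level pattern described above (encrypt just before sending, decrypt on receipt) can be illustrated with a minimal, generic TLS client sketch using Python's standard ssl module. This is not Hadoop's own configuration or API; the function names make_client_context and fetch_over_tls, the ca_file parameter, and the choice of TLS 1.2 as the minimum version are all assumptions made for illustration.

```python
import socket
import ssl


def make_client_context(ca_file=None):
    """Build a TLS client context that verifies the server certificate.

    ca_file is a hypothetical path to a CA bundle; when None, the
    system's default trust store is used.
    """
    # create_default_context turns on certificate verification and
    # hostname checking for client-side connections.
    ctx = ssl.create_default_context(cafile=ca_file)
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def fetch_over_tls(host, port, payload, ca_file=None):
    """Send payload (bytes) to host:port over TLS; return up to 4096 reply bytes."""
    ctx = make_client_context(ca_file)
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # wrap_socket performs the TLS handshake. After that, data is
        # encrypted on send and decrypted on receive, transparently
        # to the calling code.
        with ctx.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(payload)
            return tls_sock.recv(4096)
```

The application code only ever handles plaintext; the TLS layer protects the bytes while they are on the wire, which is why this level complements, rather than replaces, the at-rest options listed above.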