Configuring Encryption
The goal of encryption is to ensure that only authorized users can view, use, or contribute to a data set. These security controls add another layer of protection against potential threats posed by end users, administrators, and other potentially malicious actors on the network. Data protection can be applied at a number of levels within Hadoop:
- OS Filesystem-level - Encryption can be applied at the Linux operating system filesystem level to cover all files in a volume. An example of this approach is Cloudera Navigator Encrypt (formerly Gazzang zNcrypt), which is available to Cloudera customers licensed for Cloudera Navigator. Navigator Encrypt operates at the Linux volume level, so it can encrypt cluster data inside and outside HDFS, such as temp/spill files, configuration files, and metadata databases. It is to be used only for data related to a CDH cluster. Navigator Encrypt must be used with Cloudera Navigator Key Trustee Server (formerly Gazzang zTrustee).
CDH components, such as Impala, MapReduce, YARN, or HBase, also have the ability to encrypt data that lives temporarily on the local filesystem outside HDFS. To enable this feature, see Configuring Encryption for Data Spills.
- Network-level - Data can be encrypted just before it is sent across a network and decrypted just after receipt. In Hadoop, this covers data sent from client user interfaces as well as service-to-service communication, such as remote procedure calls (RPCs). This protection uses industry-standard protocols such as TLS/SSL.
- HDFS-level - Encryption is applied by the HDFS client software. HDFS transparent encryption operates at the HDFS folder level, allowing you to encrypt some folders and leave others unencrypted. HDFS transparent encryption cannot encrypt any data outside HDFS. To ensure reliable key storage (so that data is not lost), use Cloudera Navigator Key Trustee Server; the default Java keystore can be used for test purposes. For more information, see Enabling HDFS Encryption Using Cloudera Navigator Key Trustee Server.
Unlike OS-level and network-level encryption, HDFS transparent encryption is end-to-end. That is, it protects data both at rest and in transit, which makes it more efficient than implementing a combination of OS-level and network-level encryption.
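To make the folder-level behavior of HDFS transparent encryption more concrete, the following is a minimal sketch that assembles the standard Apache Hadoop commands for turning one HDFS directory into an encryption zone. It only builds and prints the command lines; it does not run them. The key name `example-key`, the path `/data/secure`, and the helper names `encryption_zone_commands` and `render` are hypothetical and used for illustration only. The sketch also assumes that a key provider (such as Key Trustee Server or a test keystore) is already configured and that the commands are run by users with the appropriate key and HDFS administration privileges.

```python
import shlex


def encryption_zone_commands(key_name, zone_path):
    """Return the CLI commands (as argv lists) that set up one encryption zone.

    key_name and zone_path are illustrative inputs, not values from this guide.
    """
    if not zone_path.startswith("/"):
        raise ValueError("zone_path must be an absolute HDFS path")
    return [
        # Create the encryption key in the configured key provider.
        ["hadoop", "key", "create", key_name],
        # Create the directory that will become the encryption zone.
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        # Mark the (empty) directory as an encryption zone backed by the key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
    ]


def render(commands):
    """Format argv lists as shell-safe command lines for review."""
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    for line in render(encryption_zone_commands("example-key", "/data/secure")):
        print(line)
```

Files written under the zone directory are then encrypted and decrypted by the HDFS client, while directories outside the zone remain unencrypted. Printing the commands rather than executing them keeps the sketch safe to run on a machine without a Hadoop installation; the linked topics describe the supported procedure for a CDH cluster.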
Continue reading:
- TLS/SSL Certificates Overview
- Configuring TLS Security for Cloudera Manager
- Configuring TLS/SSL for the Cloudera Navigator Data Management Component
- Configuring TLS/SSL for Publishing Cloudera Navigator Audit Events to Kafka
- Configuring TLS/SSL for Cloudera Management Service Roles
- Configuring TLS/SSL Encryption for CDH Services
- Deployment Planning for Data at Rest Encryption
- HDFS Transparent Encryption
- Cloudera Navigator Key Trustee Server
- Cloudera Navigator Key HSM
- Cloudera Navigator Encrypt
- Configuring Encryption for Data Spills
- Configuring Encrypted HDFS Data Transport
- Configuring Encrypted HBase Data Transport