Executive summary

This document provides a detailed security reference architecture for Cloudera Private Cloud Base and is accompanied by a series of blog posts. The architecture reflects the four pillars of security engineering best practice, Perimeter, Data, Access and Visibility.

The release of CDP Private Cloud Base has seen a number of significant enhancements to the security architecture including:

  • Apache Ranger for security policy management

  • Updated Ranger Key Management service

This document is intended as a reference guide only and is not a runbook or configuration manual. Cloudera recommends that you undertake your own security design activity in conjunction with Cloudera Professional Services, and where appropriate conduct third party assurances.

Note also that effective security is not just a combination of physical controls and system architecture but also organizational best practices around password/credential management, joiners-movers-leavers processes, group and privilege management, audit and compliance, separation of duties and physical access controls.

Before diving into the technologies it is worth becoming familiar with the key security principle of a layered approach that facilitates defense in depth. Each layer is defined as follows:

These multiple layers of security are applied in order to ensure the confidentiality, integrity and availability of data to meet the most robust of regulatory requirements. CDP Private Cloud Base offers 3 levels of security that implement these features

Level Security Characteristics
0 Non-secure No security configured. Non-secure clusters should never be used in production environments because they are vulnerable to any and all attacks and exploits.
1 Minimal Configured for authentication, authorization, and auditing. Authentication is first configured to ensure that users and services can access the cluster only after proving their identities. Next, authorization mechanisms are applied to assign privileges to users and user groups. Auditing procedures keep track of who accesses the cluster (and how).
2 More Sensitive data is encrypted. Key management systems handle encryption keys. Auditing has been setup for data in the metastore. System metadata is reviewed and updated regularly. Ideally, the cluster has been setup so that lineage for any data object can be traced (data governance).
3 Most The secure cluster is one in which all data, both data-at-rest and data-in-transit, is encrypted and the key management system is fault-tolerant. Auditing mechanisms comply with industry, government, and regulatory standards (PCI, HIPAA, NIST, for example), and extend from the Cluster to the other systems that integrate with it. Cluster administrators are well-trained, security procedures have been certified by an expert, and the cluster can pass technical review.

For the purposes of this document we are going to focus on the most secure level 3 security.