HDP Security Overview

Apache Knox Gateway Overview

A conceptual overview of the Apache Knox Gateway, a reverse proxy.


Knox integrates with the identity management and SSO systems used in enterprises and allows identities from these systems to be used for access to Hadoop clusters.

The Knox Gateway provides security for multiple Hadoop clusters, with these advantages:

  • Simplifies access: Extends Hadoop’s REST/HTTP services by encapsulating Kerberos within the cluster.

  • Enhances security: Exposes Hadoop’s REST/HTTP services without revealing network details, providing SSL out of the box.

  • Centralized control: Enforces REST API security centrally, routing requests to multiple Hadoop clusters.

  • Enterprise integration: Supports LDAP, Active Directory, SSO, SAML and other authentication systems.
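The access pattern described above can be illustrated with a minimal sketch of a WebHDFS call routed through the gateway. It assumes Knox's commonly used URL layout (`https://{host}:{port}/{gateway-path}/{topology}/...`), the default port 8443, and HTTP Basic credentials that the gateway validates against a directory such as LDAP. The host name, topology name, user, and password are hypothetical placeholders, and the request is only built, not sent.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def knox_webhdfs_request(host, topology, hdfs_path, op, user, password,
                         port=8443, gateway_path="gateway"):
    """Build (but do not send) a WebHDFS request routed through Knox."""
    # The client addresses only the gateway; NameNode hosts and ports
    # never appear in the URL.
    url = (
        f"https://{host}:{port}/{gateway_path}/{topology}"
        f"/webhdfs/v1{quote(hdfs_path)}?{urlencode({'op': op})}"
    )
    # HTTP Basic credentials (assumed authentication setup); no Kerberos
    # ticket is needed on the client side.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}"})


# Hypothetical values, for illustration only.
req = knox_webhdfs_request("knox.example.com", "default", "/tmp",
                           "LISTSTATUS", "guest", "guest-password")
print(req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would send it over SSL to the gateway, which authenticates the caller and forwards the call to the cluster on the caller's behalf.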

A conceptual diagram showing the connection between the ODBC/JDBC client, Apache Knox, LDAP, KDC, Ranger, HiveServer2, and HDFS.

Typical Security Flow: Firewall, Routed Through Knox Gateway

Knox can be used with both unsecured Hadoop clusters and Kerberos-secured clusters. In an enterprise deployment that uses Kerberos-secured clusters, the Apache Knox Gateway provides a security solution that:

  • Integrates well with enterprise identity management solutions

  • Protects the details of the Hadoop cluster deployment (hosts and ports are hidden from end users)

  • Reduces the number of services with which a client needs to interact
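To make the last two points concrete, the following sketch contrasts direct service endpoints with their gateway equivalents. The internal host names are hypothetical, the ports are common defaults rather than values from this document, and reusing each service's own path behind the gateway is a simplification of how a topology maps services.

```python
from urllib.parse import urlsplit

# Hypothetical internal endpoints that clients would otherwise need to know.
DIRECT = {
    "webhdfs": "http://namenode.internal:50070/webhdfs/v1",
    "templeton": "http://webhcat.internal:50111/templeton/v1",
    "oozie": "http://oozie.internal:11000/oozie",
}

# Single externally visible entry point (placeholder host and topology).
GATEWAY = "https://knox.example.com:8443/gateway/default"


def via_gateway(service):
    """Return the gateway URL that stands in for a direct service URL."""
    # Keep only the service path; the internal host and port are dropped.
    return GATEWAY + urlsplit(DIRECT[service]).path


endpoints = {name: via_gateway(name) for name in DIRECT}
# Every service is now reached through the same host and port.
hosts = {urlsplit(url).netloc for url in endpoints.values()}
print(endpoints)
print(hosts)
```

Under these assumptions a client stores one host, one port, and one set of credentials instead of one per service.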

Knox Gateway Deployment Architecture

Users who access Hadoop externally do so either through Knox, via the Apache REST API, or through the Hadoop CLI tools.

The following diagram shows how Apache Knox fits into a Hadoop deployment.

A conceptual diagram showing how Apache Knox fits into a Hadoop deployment.

NN=NameNode, RM=ResourceManager, DN=DataNode, NM=NodeManager