Apache Knox Gateway Overview

A conceptual overview of the Apache Knox Gateway, a reverse proxy.


Knox integrates with Identity Management and SSO systems used in enterprises and allows identity from these systems be used for access to Hadoop clusters.

Knox Gateway provides security for multiple Hadoop clusters, with these advantages:

  • Simplifies access: Extends Hadoop’s REST/HTTP services by encapsulating Kerberos to within the Cluster.

  • Enhances security: Exposes Hadoop’s REST/HTTP services without revealing network details, providing SSL out of the box.

  • Centralized control: Enforces REST API security centrally, routing requests to multiple Hadoop clusters.

  • Enterprise integration: Supports LDAP, Active Directory, SSO, SAML and other authentication systems.

Typical Security Flow: Firewall, Routed Through Knox Gateway

Knox can be used with both unsecured Hadoop clusters, and Kerberos secured clusters. In an enterprise solution that employs Kerberos secured clusters, the Apache Knox Gateway provides an enterprise security solution that:

  • Integrates well with enterprise identity management solutions

  • Protects the details of the Hadoop cluster deployment (hosts and ports are hidden from end users)

  • Simplifies the number of services with which a client needs to interact

Knox Gateway Deployment Architecture

Users who access Hadoop externally do so either through Knox, via the Apache REST API, or through the Hadoop CLI tools.