SSL Certificates Overview

This topic will guide you through the different certificate strategies that you can employ on your cluster to allow SSL clients to securely connect to servers using trusted certificates or certificates issued by trusted authorities. The set of certificates required depends upon the certificate provisioning strategy you implement. The following strategies, among others, are possible:

  • Certificate per host: In this strategy, you obtain one certificate for each host on which at least one SSL daemon role is running. All services on a given host will share this single certificate.
  • Certificate for multiple hosts: Using the SubjectAltName extension, it is possible to obtain a certificate that is bound to a list of specific DNS names. One such certificate could be used to protect all hosts in the cluster, or some subset of the cluster hosts. The advantage of this approach over a wildcard certificate is that it allows you to limit the scope of the certificate to a specific set of hosts. The disadvantage is that it requires you to update and redeploy the certificate whenever a host is added or removed from the cluster.
  • Wildcard certificate: You may also choose to obtain a single wildcard certificate to be shared by all services on all hosts in the cluster. This strategy requires that all hosts belong to the same domain. For example, if the hosts in the cluster have DNS names node1.example.com ... node100.example.com, you can obtain a certificate for *.example.com. Note that only one level of wildcarding is allowed; a certificate bound to *.example.com will not work for a daemon running on node1.subdomain.example.com.
When choosing an approach to certificate provisioning, bear in mind that SSL must be enabled for all core Hadoop services (HDFS, MapReduce, and YARN) as a group. For example, if you are running HDFS and YARN on your cluster, you cannot choose to enable SSL for HDFS, but not for YARN. You must enable it for both services, which implies that you must make certificates available to all daemon roles of both services. With a certificate-per-host strategy, for example, you will need to obtain a certificate for each host on which an HDFS or YARN daemon role is running.