Hadoop Security Guide
Also available as:
PDF
loading table of contents...

Enable SSL for Accumulo

One of the major features added in Accumulo 1.6.0 was the ability to configure Accumulo so that the Thrift communications will run over SSL. Apache Thrift is the remote procedure call library that is leveraged for both intra-server and client communication with Accumulo. Issuing these calls over a secure socket ensures that unwanted actors cannot inspect the traffic sent across the wire. Given the sometimes sensitive nature of data stored in Accumulo and the authentication details for users, secure communications are critical.

Due to the complex and deployment-specific nature of the security model for some systems, Accumulo expects users to provide their own certificates, guaranteeing that they are, in fact, secure. However, for those who require security but do not already operate within the confines of an established security infrastructure, OpenSSL and the Java keytool command can be used to generate the necessary components to enable wire encryption.

To enable SSL with Accumulo, it is necessary to generate a certificate authority and certificates that are signed by that authority. Typically, each client and server has its own certificate, which provides the finest level of control over a secure cluster when the certificates are properly secured.

Generate a Certificate Authority

The certificate authority (CA) controls what certificates can be used to authenticate with each other. To create a secure connection with two certificates, each certificate must be signed by a certificate authority in the "truststore" (A Java KeyStore which contains at least one Certificate Authority's public key). When creating your own certificate authority, a single CA is typically sufficient (and would result in a single public key in the truststore). Alternatively, a third party can also act as a certificate authority (to add an additional layer of security); however, these are typically not a free service.

The following is an example of creating a certificate authority and adding its public key to a Java KeyStore to provide to Accumulo.

# Create a private key
openssl genrsa -des3 -out root.key 4096

# Create a certificate request using the private key
openssl req -x509 -new -key root.key -days 365 -out root.pem

# Generate a Base64-encoded version of the PEM just created
openssl x509 -outform der -in root.pem -out root.der

# Import the key into a Java KeyStore
keytool -import -alias root-key -keystore truststore.jks -file root.der

# Remove the DER formatted key file (as we don't need it anymore)
rm root.der

Remember to protect root.key and never distribute it, as the private key is the basis for your circle of trust. The keytool command will prompt you about whether or not the certificate should be trusted: enter "yes". The truststore.jks file, a "truststore", is meant to be shared with all parties communicating with one another. The password provided to the truststore verifies that the contents of the truststore have not been tampered with.

Generate a Certificate/Keystore Per Host

It is desirable to generate a certificate for each host in the system. Additionally, each client connecting to the Accumulo instance running with SSL should be issued its own certificate. Issuing individual certificates to each entity provides proper control to revoke/reissue certificates to clients as necessary, without widespread interruption.

The following commands create a private key for the server, generate a certificate signing request created from that private key, use the certificate authority to generate the certificate using the signing request. and then create a Java KeyStore with the certificate and the private key for our server.

# Create the private key for our server 
openssl genrsa -out server.key 4096 

# Generate a certificate signing request (CSR) with our private key 
openssl req -new -key server.key -out server.csr 

# Use the CSR and the CA to create a certificate for the server (a reply to the CSR) 
openssl x509 -req -in server.csr -CA root.pem -CAkey root.key -CAcreateserial -out server.crt -days 365

# Use the certificate and the private key for our server to create PKCS12 file 
openssl pkcs12 -export -in server.crt -inkey server.key -certfile server.crt -name 'server-key' -out server.p12

# Create a Java KeyStore for the server using the PKCS12 file (private key) 
keytool -importkeystore -srckeystore server.p12 -srcstoretype pkcs12 -destkeystore server.jks -deststoretype JKS

# Remove the PKCS12 file as we don't need it 
rm server.p12 

# Import the CA-signed certificate to the keystore 
keytool -import -trustcacerts -alias server-crt -file server.crt -keystore server.jks 

This, combined with the truststore, provides what is needed to configure Accumulo servers to run over SSL. The private key (server.key), the certificate signed by the CA (server.pem), and the keystore (server.jks) should be restricted to only be accessed by the user running Accumulo on the host it was generated for. Use chown and chmod to protect the files, and do not distribute them over non-secure networks.

Configure Accumulo Servers

Now that the Java KeyStores have been created with the necessary information, the Accumulo configuration must be updated so that Accumulo creates the Thrift server over SSL instead of a normal socket. Configure the following properties in accumulo-site.xml:

<property>
  <name>rpc.javax.net.ssl.keyStore</name>
  <value>/path/to/server.jks</value>
</property>
<property>
  <name>rpc.javax.net.ssl.keyStorePassword</name>
  <value>server_password</value>
</property>
<property>
  <name>rpc.javax.net.ssl.trustStore</name>
  <value>/path/to/truststore.jks</value>
</property>
<property>
  <name>rpc.javax.net.ssl.trustStorePassword</name>
  <value>truststore_password</value>
</property>
<property>
  <name>instance.rpc.ssl.enabled</name>
  <value>true</value>
</property>

The keystore and truststore paths are both absolute paths on the local file system (not HDFS). Remember that the server keystore should only be readable by the user running Accumulo and, if you place plain-text passwords in accumulo-site.xml, make sure that accumulo-site.xml is also not globally readable. To keep these passwords out of accumulo-site.xml, consider configuring your system with the new Hadoop CredentialProvider class. See ACCUMULO-2464 for more information on what will be available in Accumulo-1.6.1.

Also, be aware that if unique passwords are used for each server when generating the certificate, this will result in different accumulo-site.xml files for each host. Unique configuration files for each host will add complexity to the configuration management of your instance. The use of a CredentialProvider (a feature from Hadoop which allows for acquisitions of passwords from alternate systems) can help alleviate the issues with unique accumulo-site.xml files on each host. A Java KeyStore can be created using the CredentialProvider tools, which eliminates the need for passwords to be stored in accumulo-site.xml, and can instead point to the CredentialProvider URI which is consistent across hosts.

Configure Accumulo Clients

To configure Accumulo clients, use $HOME/.accumulo/config. This is a simple Java properties file: each line is a configuration, key, and value separated by a space, and lines beginning with a # symbol are ignored. For example, if we generated a certificate and placed it in a keystore (as described above), we would generate the following file for the Accumulo client.

instance.rpc.ssl.enabled true
rpc.javax.net.ssl.keyStore  /path/to/client-keystore.jks
rpc.javax.net.ssl.keyStorePassword  client-password
rpc.javax.net.ssl.trustStore  /path/to/truststore.jks
rpc.javax.net.ssl.trustStorePassword  truststore-password

When creating a ZooKeeperInstance, the implementation will automatically look for this configuration file and set up a connection with the methods defined in this file. The ClientConfiguration class also contains methods that can be used instead of a configuration file on the file system. Again, the paths to the keystore and truststore are on the local file system, not HDFS.