Using a New AWS Region in Cloudera Director

Cloudera Director's AWS support, implemented as a plugin, ships with a predefined set of known AWS regions. Cloudera adds support for additional regions in new Cloudera Director releases when possible. Because you might want to use a new region before a release adds it to the plugin, Cloudera Director also lets you use regions that are not yet listed by default.

For more information about AWS regions, see Regions and Availability Zones in the AWS documentation.

Entering the Region Code

When you define a new AWS environment in the web interface, Cloudera Director asks you which region to use. You can select the region for EC2, where instances hosting Cloudera Manager and cluster components run, and for RDS, where an external database server can house databases for Cloudera Manager and services like Hive and Oozie.

The region selection widgets are ordinary drop-down menus, but the menus are also editable. To use a region that isn't listed, just type in its region code.

When you use Cloudera Director's configuration file support for defining new deployments and clusters, you don't have any widgets. Simply supply the region code for EC2 and RDS in the expected locations.
  • EC2 region: in the provider section, as the region field
  • RDS region: in the provider section, as the rdsRegion field. If the region is not specified, it defaults to the EC2 region
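
As an illustration, the provider section of a configuration file for a new region might look like the following sketch. The region code xy-east-1 is a placeholder, as in the other examples on this page. The rdsRegion value is shown only to illustrate where the field goes, and other provider settings such as credentials and networking are omitted.

```
provider {
    type: aws
    # Placeholder code for a region that is not yet listed by default
    region: xy-east-1
    # Optional; defaults to the EC2 region when omitted
    rdsRegion: xy-east-1
    # Other provider settings (credentials, subnet, security groups) omitted
}
```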

Region Endpoints

In most cases, Cloudera Director can figure out the AWS endpoints for the different services in a region, so just naming the new region is enough to get things moving. If you receive errors that an AWS service could not be reached, you may need to specify some endpoints, as described below for RDS, IAM, and KMS.

For general information about region endpoints in AWS, see AWS Regions and Endpoints in the AWS documentation.

RDS

If you plan on using RDS, you must supply the RDS endpoint for your chosen region. There are two ways to do this.
  • Specify the endpoint URL directly when you define your environment. In the web interface, expand the Advanced Options section under RDS (Relational Database Service) and enter the endpoint URL in the RDS region endpoint field. In a configuration file, give the URL as the value for the rdsRegionEndpoint field in the provider section. Here is what an endpoint URL looks like:

    rdsRegionEndpoint: https://rds.xy-east-1.amazonaws.com
  • Rather than specifying the RDS endpoint URL with each environment you create, supply it in a configuration file that is read by Cloudera Director's AWS support, so that it is used for all environments created with that instance of Cloudera Director. The configuration file is named rds.endpoints.properties and, by default, resides in the directory /var/lib/cloudera-director-plugins/aws-provider-version/etc/. The version number in the aws-provider part of the path changes with most Cloudera Director releases, as the plugin version changes. For example, aws-provider-1.4.1 corresponds to Cloudera Director 2.4, so the path and file name with Cloudera Director 2.4 are as follows:
    /var/lib/cloudera-director-plugins/aws-provider-1.4.1/etc/rds.endpoints.properties

Cloudera Director ships with an example of the file that you can use as a template: rds.endpoints.properties.example. Copy this file to a new rds.endpoints.properties file in that directory, and add a line for the RDS endpoint URL, for example:

    xy-east-1=https://rds.xy-east-1.amazonaws.com

After adding a new endpoint, restart Cloudera Director if it is running.

IAM

The IAM service is normally accessed using a single, global endpoint that works across all AWS regions. Some regions, however, have their own IAM endpoint. If you are using such a region, supply its custom IAM endpoint. In the web interface, expand the Advanced Options section under EC2 (Elastic Compute Cloud) on the environment page, and enter the endpoint URL in the IAM endpoint field. In a configuration file, specify it in the iamEndpoint field in the provider section.

    iamEndpoint: https://iam.xy-east-1.amazonaws.com

KMS

Cloudera Director normally computes the expected KMS endpoint for your chosen region. If that process fails, you can provide the endpoint URL yourself. In the web interface, expand the Advanced Options section under EC2 (Elastic Compute Cloud) on the environment page, and enter the endpoint URL in the KMS region endpoint field. In a configuration file, specify it in the kmsEndpoint field in the provider section.

    kmsEndpoint: https://kms.xy-east-1.amazonaws.com
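
If more than one endpoint has to be supplied, the overrides can sit side by side in the provider section. The following sketch combines the three fields described above. The region code and endpoint URLs are placeholders that follow the pattern of the earlier examples, so substitute the endpoints that AWS documents for your region, and include only the fields you actually need.

```
provider {
    type: aws
    region: xy-east-1
    # Needed only if you use RDS and have not listed the endpoint
    # in rds.endpoints.properties
    rdsRegionEndpoint: "https://rds.xy-east-1.amazonaws.com"
    # Needed only if the region has its own IAM endpoint
    iamEndpoint: "https://iam.xy-east-1.amazonaws.com"
    # Needed only if the KMS endpoint cannot be computed automatically
    kmsEndpoint: "https://kms.xy-east-1.amazonaws.com"
    # Other provider settings omitted
}
```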

Other Considerations

A new AWS region usually does not support the full range of services and features that are available in older, established regions. It's important to understand which services and features your chosen region lacks, so that you do not request them through Cloudera Director. Cloudera Director does not keep track of which services are available in which regions.

Here are some examples of items that work in older regions but might work only partially, or not at all, in newer ones.
  • AMIs - common "stock" AMIs may not exist for new regions
  • instance types - deprecated instance types are often left out of new regions
  • dedicated instances (tenancy)
  • Spot blocks
  • RDS instance encryption

Cloudera Director triggers operating system updates and performs software downloads on instances it allocates in your chosen region. Depending on the local network configuration, these operations can be very slow or can fail. If so, you may need to take some of the following steps.
  • Disable instance normalization. This stops Cloudera Director from performing its usual automated, general setup work on new instances. You should replace that work with your own, either by building a custom AMI with the work already done, or by using a bootstrap script. Normalization can be disabled using a configuration file; contact Cloudera support for guidance on what else you need to do.

  • Create a preloaded AMI. Cloudera Director can avoid downloading Cloudera Manager and CDH software if it is already present in expected locations on instances. This also speeds up deployment and cluster bootstrap processes, even when download speeds from Cloudera repositories are reasonable. See the documentation for more information.

  • Mirror Cloudera repositories. Instead of preloading an AMI with Cloudera software, you can host the software on local mirrors and point Cloudera Director to them as alternative download locations. As with preloaded AMIs, taking this step can speed up bootstrap processes and make your architecture less vulnerable to network problems. See the documentation for more information.