Registering Cloudera Hybrid Environments

Learn how to register a hybrid environment.

Ensure that you have fulfilled the following Cloud Provider requirements based on the Cloud Service Provider of your choice: AWS, Azure, or GCP.

  • Ensure that your AWS account has the right permissions as described in AWS account permissions.
  • Create the cross-account IAM role and obtain its Cross-account Role ARN as described in Creating cross-account access IAM role.
  • Create your credential to provision the environment as described in Creating a provisioning credential for AWS.
  • Ensure that the VPC and subnet requirements are met as described in VPC and subnets.
  • Create or use your existing security groups as described in Security groups.
    • If you plan to create new Security Groups, make sure you have the required IP address range information.
    • If you plan to use existing Security Groups, you need to open all the required ports.
  • If you plan to use Customer Managed Encryption Keys (CMEK), configure them as described in Customer managed encryption keys.
  • Create or use your existing SSH public key as described in SSH key pair.
    • If you plan to create a new SSH key, make sure you are using an RSA or ED25519 public key. This will create a new EC2 key pair on the AWS side, and all cloud resources will use it for SSH authentication.
    • If you plan to use an existing SSH key, you need to refer to an existing AWS EC2 key pair. The Cloudera Control Plane validates that the key exists.
  • Create an S3 bucket and set up the Logs location as described in AWS cloud storage prerequisites.
  • Required role: EnvironmentCreator
  1. Navigate to Cloudera Management Console.
  2. Select Environments.
  3. Click Register environment.
  4. In the Purpose section, select Hybrid Cloud Environment.
  5. Enter the following information for the new hybrid environment:
    Environments Page
    General Information
    Environment Name Enter a name for the new hybrid environment.
    Description (optional) Enter a short description for the new hybrid environment.
    Select Cloud Provider Select the cloud provider of your choice.
  6. If you already have a credential set up, select it from the dropdown list.
  7. If you need to create new credentials, enter or select the following information:
    Register Environment page
    Amazon Web Services Credential section
    Name Enter a name for the new credential.
    Description (optional) Enter a short description for the new credential.
    Enable Permission Verification Use this toggle to have Cloudera check permissions for your credential. Cloudera will verify that you have the required permissions for your environment.
    Default | Minimal

    Select whether to use Default or Minimal role.

    Use the provided JSON to create the AWS IAM policy.

    Use Minimal role for a general Hybrid environment. Use Default if you plan to use Data Services.

    Service Manager Account ID

    External ID

    Use the provided IDs to create the AWS IAM role.
    Cross-account Role ARN Enter the ARN of the cross-account role.
    Show CLI Command (optional) Click this button to display the command required to create the credential from the CLI.
    Create Credential Click this button to create the new credential.
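
    The Service Manager Account ID and External ID above are the two values an AWS cross-account role typically references in its trust policy. The following is a minimal sketch of that generic AWS pattern, not the exact policy Cloudera provides; the function name and the placeholder IDs are hypothetical, and the authoritative steps are in Creating cross-account access IAM role.

```python
import json


def build_trust_policy(service_manager_account_id: str, external_id: str) -> dict:
    """Build a generic AWS cross-account trust policy document.

    Both arguments are assumed to be the values shown on the credential page.
    """
    account_id = service_manager_account_id.strip()
    # AWS account IDs are 12-digit numbers.
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("expected a 12-digit AWS account ID")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The external account that is allowed to assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # The role can only be assumed when this external ID is supplied.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; substitute the IDs displayed in the UI.
    print(json.dumps(build_trust_policy("123456789012", "example-external-id"), indent=2))
```

    The permissions policy itself still comes from the JSON provided under Default | Minimal; this sketch covers only the trust relationship.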

    Click Next to proceed to the Region, Networking and Security page.

    Region, Networking and Security page
    Region, Location section
    Select Region Select the region for the new environment.
    Network section
    Select Network Select the existing virtual network where you would like to provision all Cloudera resources. For more information, refer to VPC and subnet.
    Select Subnets Select existing subnets within the selected VPC. For more information, refer to VPC and subnet.
    Enable Cluster Connectivity Manager (CCM) The Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-prem classic clusters that are on private subnets. For more information, refer to the Cluster Connectivity Manager documentation.
    Enable Endpoint Access Gateway

    When the Cluster Connectivity Manager is enabled, you can optionally enable Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking.

    Under Select Subnets for the Endpoint Access Gateway, select the public subnets for which you would like to use the gateway. The number of subnets must be the same as selected under Select Subnets and the availability zones must match.

    For more information, refer to the Public Endpoint Access Gateway section.
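
    The subnet rule above (same number of subnets, matching availability zones) can be checked before you fill in the form. This is a rough sketch under stated assumptions: the subnet-to-zone mapping is supplied by you, "must match" is interpreted as the same zones with the same counts, and all names are hypothetical.

```python
from collections import Counter
from typing import List, Mapping, Sequence


def check_gateway_subnets(
    cluster_subnets: Sequence[str],
    gateway_subnets: Sequence[str],
    subnet_az: Mapping[str, str],
) -> List[str]:
    """Return a list of problems; an empty list means the selection looks consistent."""
    problems = []
    if len(cluster_subnets) != len(gateway_subnets):
        problems.append(
            f"subnet count differs: {len(cluster_subnets)} selected, "
            f"{len(gateway_subnets)} for the gateway"
        )
    # Compare availability zones as multisets (assumption: counts per zone must agree).
    cluster_azs = Counter(subnet_az[s] for s in cluster_subnets)
    gateway_azs = Counter(subnet_az[s] for s in gateway_subnets)
    if cluster_azs != gateway_azs:
        problems.append("availability zones of the two subnet selections do not match")
    return problems


if __name__ == "__main__":
    # Hypothetical subnet IDs and zones.
    zones = {
        "subnet-private-a": "us-west-2a",
        "subnet-private-b": "us-west-2b",
        "subnet-public-a": "us-west-2a",
        "subnet-public-b": "us-west-2b",
    }
    print(check_gateway_subnets(
        ["subnet-private-a", "subnet-private-b"],
        ["subnet-public-a", "subnet-public-b"],
        zones,
    ))
```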

    Encryption section
    Enable Customer Managed Keys Enable this if you would like to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, refer to Customer managed encryption keys.
    Proxies section
    Select Proxy Configuration
    Select one of the following options:
    • Do Not Use Proxy Configuration
    • Create New Proxy Configuration

      If you would like Cloudera to automatically create security groups for you and open them to the CIDR range specified.

      Enter the following information for the new proxy configuration:

      • Name
      • Description (optional)
      • Protocol
      • Server Host
      • Server Port
      • No Proxy Hosts
      • Inbound Proxy CIDR
      • Username
      • Password
    • Existing proxy configuration

      You need to open all the required ports if you would like to use your existing security groups.

    For more information, refer to Setting up a proxy server.

    Security Access Settings
    Select one of the following options to determine inbound security group settings that allow connections to the Cloudera Data Hub clusters from your organization’s computers:
    • Create New Security Groups

      Select this option if you would like Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      • Access CIDR

        Enter a custom CIDR IP range for all new security groups that will be created for the Cloudera Data Hub clusters.

    • Select Existing Security Groups

      Select this option if you would like to use your existing security groups. In this case, you need to open all the required ports. Refer to Security groups to ensure that you open all ports required for your users to access environment resources.

      • Select Existing Security Group for Gateway Nodes.
      • Select Existing Security Group as default.
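
    The Access CIDR value above can be sanity-checked before you enter it. The following is a minimal sketch using Python's standard ipaddress module; the function names are hypothetical, and the strict parsing (rejecting addresses with host bits set) is an assumption about what counts as a well-formed range, not a documented Cloudera rule.

```python
import ipaddress
from typing import Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_access_cidr(value: str) -> Network:
    """Parse a CIDR range, raising ValueError if it is malformed.

    strict=True rejects values such as 10.0.0.1/16, where host bits are set.
    """
    return ipaddress.ip_network(value.strip(), strict=True)


def is_open_to_everyone(network: Network) -> bool:
    """True for 0.0.0.0/0 or ::/0, which admit traffic from any address."""
    return network.prefixlen == 0


if __name__ == "__main__":
    net = parse_access_cidr("10.0.0.0/16")
    print(net, net.num_addresses, is_open_to_everyone(net))
```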
    SSH Settings section
    New SSH public key Enter a new SSH public key.
    Existing SSH public key Enter the name of an existing EC2 key pair that holds your desired SSH key.
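
    Because a new SSH key must be an RSA or ED25519 public key, it can help to confirm the key type before pasting it in. This is a minimal sketch that assumes the key is in the common one-line OpenSSH format (type, base64 data, optional comment); the function name is hypothetical, and it inspects only the key type, not key strength.

```python
import base64
import binascii
import struct

ALLOWED_KEY_TYPES = {"ssh-rsa", "ssh-ed25519"}


def ssh_public_key_type(public_key_line: str) -> str:
    """Return the key type of a one-line OpenSSH public key, or raise ValueError."""
    parts = public_key_line.split()
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-data> [comment]'")
    declared_type, encoded = parts[0], parts[1]
    try:
        blob = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("key data is not valid base64") from exc
    if len(blob) < 4:
        raise ValueError("key data is too short")
    # The decoded blob starts with a length-prefixed copy of the key type.
    (name_len,) = struct.unpack(">I", blob[:4])
    embedded_type = blob[4:4 + name_len].decode("ascii", errors="replace")
    if embedded_type != declared_type:
        raise ValueError("declared key type does not match the key data")
    if declared_type not in ALLOWED_KEY_TYPES:
        raise ValueError(f"unsupported key type: {declared_type}")
    return declared_type
```

    For example, passing the contents of a .pub file to ssh_public_key_type either returns "ssh-rsa" or "ssh-ed25519" or raises ValueError with the reason.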
    Add tags section
    Add (optional) You can optionally add tags to be created for your resources on AWS. For more information, refer to Defining custom tags.
    Advanced options section
    Network And Availability Enable Multiple Availability Zones for FreeIPA. For more information, refer to Deploying Cloudera in multiple AWS availability zones.
    Hardware And Storage Select the instance type for the FreeIPA nodes. Click the edit icon in the top right corner and select the instance type from the drop-down list. For more information on instance types, refer to Amazon EC2 instance types.
    Cluster Extensions You can optionally select and attach previously registered recipes to run on a specific FreeIPA host group. For more information, see Recipes.
    Security Select the SELinux mode based on your requirements:
    • Permissive
    • Enforcing

    Click Next to proceed to the Storage page.

    Storage page
    Logs section
    Logger Instance Profile Select the IAM instance profile (or IAM role) that provides Cloudera with write access to the S3 logs data location.
    Logs Location Base Provide a path to an existing S3 bucket or a directory within an existing S3 bucket where log data will be stored.
    Backup Location Base (optional)

    Provide a path to an existing S3 bucket or a directory within an existing S3 bucket where FreeIPA backups will be stored.

    If none is provided, the log location will be used.

    Telemetry section
    Enable Cloudera Observability (optional) When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, refer to Enabling workload analytics and logs collection.
    Register Environment page
    Microsoft Azure Credential section
    Name Enter a name for the new credential.
    Description (optional) Enter a short description for the new credential.
    Default | Minimal

    Select whether to use Default or Minimal role.

    Use the provided JSON to create the role definition in Azure.

    Use Minimal role for a general Hybrid environment. Use Default if you plan to use Data Services.

    Command 1 Use the provided command in the Azure Shell to associate the new certificate with the service principal.
    Command 2 Use the provided command in the Azure Shell to identify your Subscription ID and Tenant ID.
    Show CLI Command (optional) Click this button to display the command required to create the credential from the CLI.
    Create Credential Click this button to create the credential.

    Click Next to proceed to the Region, Networking and Security page.

    Region, Networking and Security page
    Region, Location section
    Select Region Select the region for the new environment.
    Resource Group section
    Select Resource Group

    Select one of the following:

    • Select an existing resource group to have all Cloudera resources provisioned into that resource group.
    • Select Create new resource groups to have Cloudera create multiple resource groups.
    Network section
    Select Network Select the existing virtual network where you would like to provision all Cloudera resources. Refer to VPC and subnet.
    Select Subnets This option is only available if you choose to use an existing network. You must select multiple subnets; Cloudera distributes resources evenly across them.
    Enable Cluster Connectivity Manager (CCM) The Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-prem classic clusters that are on private subnets. For more information, refer to the Cluster Connectivity Manager documentation.
    Enable Endpoint Access Gateway When Cluster Connectivity Manager is enabled, you can optionally enable Public Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking.

    If you are using your existing VPC, under Select Endpoint Access Gateway Subnets, select the public subnets for which you would like to use the gateway. The number of subnets must match that set under Select Subnets, and the availability zones must match. For more information, refer to Public Endpoint Access Gateway.

    Create Public IPs This option is disabled by default when Cluster Connectivity Manager is enabled. It is enabled by default when Cluster Connectivity Manager is disabled.
    Database section
    Database
    Select one of the following:
    • Flexible Server
    • Flexible Server with Private Link

      You must select the Private DNS Zone for the database from the drop-down menu.

    • Flexible Server with Delegated Subnet (deprecated)

    For more information on Flexible Servers, refer to Using Azure Database for PostgreSQL Flexible Server.

    Encryption section
    Enable Encryption at Host
    Enable Customer Managed Keys Enable this option if you would like to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, refer to Customer managed encryption keys.
    Proxies section
    Select Proxy Configuration
    Select one of the following:
    • Do Not Use Proxy Configuration
    • Create New Proxy Configuration

      If you would like Cloudera to automatically create security groups for you and open them to the CIDR range specified.

      Enter the following information for the new proxy configuration:

      • Name
      • Description (optional)
      • Protocol
      • Server Host
      • Server Port
      • No Proxy Hosts
      • Inbound Proxy CIDR
      • Username
      • Password
    • Select existing proxy configuration

      You need to open all the required ports if you would like to use your existing security groups.

    For more information, refer to Setting up a proxy server.

    Security Access Settings
    Select one of the following options to determine inbound security group settings that allow connections to the Cloudera Data Hub clusters from your organization’s computers:
    • Create New Security Groups

      Select this option if you would like Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      • Access CIDR

        Enter a custom CIDR IP range for all new security groups that will be created for the Cloudera Data Hub clusters.

    • Select Existing Security Groups

      Select this option if you would like to use your existing security groups. In this case, you need to open all the required ports. Refer to Security groups to ensure that you open all ports required for your users to access environment resources.

      • Select Existing Security Group for Gateway Nodes.
      • Select Existing Security Group as default.
    SSH Settings section
    New SSH public key Enter a new SSH public key.
    Existing SSH public key Enter the name of an existing SSH key pair.
    Add tags section
    Add (optional) You can optionally add tags to be created for your resources on Azure. For more information, refer to Defining custom tags.
    Advanced options section
    Network And Availability Enable Multiple Availability Zones for FreeIPA. For more information, refer to Deploying Cloudera in multiple Azure availability zones.
    Hardware And Storage You can specify an instance type for each host group. For more information on instance types, refer to Sizes for virtual machines in Azure.
    Cluster Extensions You can optionally select and attach previously registered recipes to run on FreeIPA nodes. For more information, see Recipes.
    Security

    Select the SELinux mode based on your requirements:

    • Permissive
    • Enforcing

    Click Next to proceed to the Storage page.

    Storage page
    Logs section
    Logger Instance Profile The logger requires Storage Blob Data Contributor role on the provided storage account.
    Logs Location Base Provide your filesystem and storage account name in a filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format where data will be stored.
    • Filesystem must already exist.
    • The storage account must be Storage V2.
    • Subfolders are optional.
    Backup Location Base (optional) Provide your filesystem and storage account name in a filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format where FreeIPA backups will be stored.
    • Filesystem must already exist.
    • The storage account must be Storage V2.
    • Subfolders are optional.
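
    The filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format used for both locations above can be checked with a short pattern. This is a sketch, not Cloudera's own validation: the function and type names are hypothetical, the storage account rule (3 to 24 lowercase letters and digits) follows Azure naming rules, and the filesystem pattern is a simplified assumption. It does not verify that the filesystem exists or that the account is Storage V2.

```python
import re
from typing import NamedTuple

_LOCATION_RE = re.compile(
    r"^(?P<filesystem>[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"      # simplified filesystem name
    r"@(?P<account>[a-z0-9]{3,24})\.dfs\.core\.windows\.net"  # storage account endpoint
    r"(?P<subfolders>(?:/[^/\s]+)*)/?$"                       # optional subfolders
)


class AdlsLocation(NamedTuple):
    filesystem: str
    account: str
    subfolders: str  # empty string when no subfolders are given


def parse_location(value: str) -> AdlsLocation:
    """Split a location base into its parts, raising ValueError if malformed."""
    match = _LOCATION_RE.match(value.strip())
    if match is None:
        raise ValueError(
            "expected filesystem@storageaccountname.dfs.core.windows.net[/subfolders]"
        )
    return AdlsLocation(match["filesystem"], match["account"], match["subfolders"])


if __name__ == "__main__":
    # Hypothetical filesystem and storage account names.
    print(parse_location("logs@mystorageacct.dfs.core.windows.net/env1/logs"))
```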
    Telemetry section
    Enable Cloudera Observability (optional) When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, refer to Enabling workload analytics and logs collection.
    Register Environment page
    Google Cloud Platform Credential section
    Name Enter a name for the new credential.
    Description (optional) Enter a short description for the new credential.
    Default | Minimal

    Select whether to use Default or Minimal role.

    Use the provided commands to create a service account through the Google Cloud SDK or Google Cloud Shell.

    Use Minimal role for a general Hybrid environment. Use Default if you plan to use Data Services.

    Upload file Use the Upload file button to upload a service account private key in JSON format.
    Show CLI Command (optional) Click this button to display the command required to create the credential from the CLI.
    Create Credential Click this button to create the credential.
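
    Before uploading the service account private key, you may want to confirm that the file is a service account key in JSON format. The following is a minimal sketch; the function name is hypothetical, and the field list covers only fields that Google service account key files are known to contain, not a Cloudera-specific requirement.

```python
import json

# Fields present in a Google service account key file; not an exhaustive list.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> dict:
    """Parse the key file contents and raise ValueError if it looks wrong."""
    key = json.loads(raw_json)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(key, dict):
        raise ValueError("key file must contain a JSON object")
    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("expected a key of type 'service_account'")
    return key
```

    A typical use is to read the downloaded file as text and pass it to check_service_account_key before clicking Upload file.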

    Click Next to proceed to the Region, Networking and Security page.

    Region, Networking and Security page
    Region, Location section
    Select Region Select the region for the new environment.
    Select Zone Select the zone within the selected region.
    Network section
    Use Shared VPC Shared VPC allows an organization to connect resources from multiple projects to a common Virtual Private Cloud (VPC) network, so that they can communicate with each other securely and efficiently using internal IPs from that network. When you use Shared VPC, you designate a project as a host project and attach one or more service projects to it. The VPC networks in the host project are called Shared VPC networks. Eligible resources from service projects can use subnets in the Shared VPC network. For more information, see https://cloud.google.com/vpc/docs/shared-vpc
    Select Network Select the existing VPC where you would like to provision all Cloudera resources. Refer to VPC and subnet.
    Select Subnets Select at least one subnet within the selected VPC. Refer to VPC and subnet.
    Create Private Subnets

    This option is only available if you choose to have a new network and subnets created. It is turned on by default so that private subnets are created in addition to public subnets. If you disable it, only public subnets will be created.

    For production deployments, Cloudera recommends that you use private subnets. Work with your internal IT teams to ensure that users can access the browser interfaces for cluster services.

    Enable Cluster Connectivity Manager (CCM) The Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-prem classic clusters that are on private subnets. For more information, refer to the Cluster Connectivity Manager documentation.
    Enable Endpoint Access Gateway When Cluster Connectivity Manager is enabled, you can optionally enable Public Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking.

    If you are using your existing VPC, under Select Endpoint Access Gateway Subnets, select the public subnets for which you would like to use the gateway. The number of subnets must match that set under Select Subnets, and the availability zones must match. For more information, refer to Public Endpoint Access Gateway.

    Create Public IPs This option is disabled by default when Cluster Connectivity Manager is enabled. It is enabled by default when Cluster Connectivity Manager is disabled.
    Encryption section
    Enable Customer Managed Keys Enable this if you would like to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, refer to Customer managed encryption keys.
    Proxies section
    Select Proxy Configuration
    Select one of the following:
    • Do Not Use Proxy Configuration
    • Create New Proxy Configuration

      If you would like Cloudera to automatically create security groups for you and open them to the CIDR range specified.

      Enter the following information for the new proxy configuration:

      • Name
      • Description (optional)
      • Protocol
      • Server Host
      • Server Port
      • No Proxy Hosts
      • Inbound Proxy CIDR
      • Username
      • Password
    • Existing proxy configuration

      If you would like to use your existing security groups. In this case, you need to open all required ports.

    For more information, refer to Setting up a proxy server.

    Security Access Settings
    You have two options:
    • Do not create firewall rule

      Select this option if you are using a shared VPC and have already set the firewall rules directly on the VPC.

    • Provide existing firewall rules

      If not all of your firewall rules are set directly on the VPC, provide the previously created firewall rules for SSH and UI access. You should select two existing firewall rules, one for Knox gateway-installed nodes and another for all other nodes. You may select the same firewall rule in both places if needed.

    For information on required ports, refer to Firewall rules.

    SSH Settings section
    New SSH public key Enter a new SSH public key.
    Existing SSH public key Enter the name of an existing SSH key pair.
    Add tags section
    Add (optional) You can optionally add tags to be created for your resources on GCP. For more information, refer to Defining custom tags.
    Advanced options section
    Network And Availability Enable Multiple Availability Zones for FreeIPA. For more information, refer to Deploying Cloudera In Multiple GCP Availability Zones.
    Hardware And Storage You can specify an instance type for each host group. For more information on instance types, refer to the Google Cloud documentation on machine types.
    Cluster Extensions You can optionally select and attach previously registered recipes to run on FreeIPA nodes.
    Security

    Select the SELinux mode based on your requirements:

    • Permissive
    • Enforcing

    Click Next to proceed to the Storage page.

    Storage page
    Logs section
    Logger Service Profile Select the Service Account that provides Cloudera with write access to the Google Cloud Storage location where logs will be stored.
    Logs Location Base Provide a path to an existing GCS bucket or a directory within an existing GCS bucket where data will be stored. For more information, refer to Minimum setup for cloud storage.
    Backup Location Base (optional) Provide a path to an existing GCS bucket or a directory within an existing GCS bucket where FreeIPA backups will be stored. For more information, refer to Minimum setup for cloud storage.
    Telemetry section
    Enable Cloudera Observability (optional) When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, refer to Enabling workload analytics and logs collection.
  8. Click Register Environment to finish the hybrid environment registration process.
You have created the Hybrid Environment.

After your environment is running, perform the following steps: