Registering Cloudera Hybrid Environments

Learn how to register a hybrid environment.

You must fulfill the following AWS, Azure, or GCP requirements:

  • Your AWS account must have the right permissions as described in AWS account permissions.
  • You must create the Cross-account Role ARN corresponding to the Cross Account Role as described in Creating cross-account access IAM role.
  • You must create your credential to provision the environment as described in Creating a provisioning credential for AWS.
  • The VPC and subnet requirements must be met as described in VPC and subnets.
  • You must create or use your existing security groups as described in Security groups.
    • If you plan to create new Security Groups, you must have the required IP address range information.
    • If you plan to use existing Security Groups, you must open all the required ports.
  • If you plan to use Customer Managed Encryption Keys (CMEK), you must configure them as described in Customer managed encryption keys.
  • You must create or use your existing SSH public key as described in SSH key pair.
    • If you plan to create a new SSH key, you must use an RSA or ED25519 public key. This will create a new EC2 key pair on the AWS side, and all cloud resources will use it for SSH authentication.
    • If you plan to use an existing SSH key, you must refer to an existing AWS EC2 key pair. The Cloudera Control Plane validates that the key exists.
  • You must create an S3 bucket and set up the Logs location as described in AWS cloud storage prerequisites.
  • Required role: EnvironmentCreator
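
The SSH key requirement above (an RSA or ED25519 public key) can be checked locally before you paste the key into the registration wizard. The following is a minimal sketch, assuming the key is in the usual single-line OpenSSH format (`<type> <base64-data> [comment]`). The function names are illustrative and are not part of any Cloudera tool.

```python
import base64
import struct

# Key types named in the prerequisites: RSA and ED25519.
ALLOWED_KEY_TYPES = {"ssh-rsa", "ssh-ed25519"}


def ssh_public_key_type(line: str) -> str:
    """Return the key type of a one-line OpenSSH public key, or raise ValueError."""
    parts = line.strip().split()
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-data> [comment]'")
    declared, encoded = parts[0], parts[1]

    # The base64 payload starts with a 4-byte big-endian length followed by
    # the key type string; it should agree with the declared type.
    blob = base64.b64decode(encoded, validate=True)
    if len(blob) < 4:
        raise ValueError("key data is too short")
    (name_length,) = struct.unpack(">I", blob[:4])
    embedded = blob[4:4 + name_length].decode("ascii", errors="replace")
    if embedded != declared:
        raise ValueError(f"declared type {declared!r} does not match key data")
    return declared


def is_accepted_key(line: str) -> bool:
    """True if the line is a well-formed RSA or ED25519 public key."""
    try:
        return ssh_public_key_type(line) in ALLOWED_KEY_TYPES
    except ValueError:
        return False
```

Calling `is_accepted_key()` on the contents of a `.pub` file returns `False` for other key types, such as ECDSA, and for malformed input. It only checks the format and does not confirm that the key is usable on AWS.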
  1. Go to Cloudera Management Console.
  2. Select Environments.
  3. Click Register environment.
  4. In the Purpose section, select the Hybrid Cloud Environment option.
  5. Enter the following general information for the new hybrid environment:
    Environments page
    General Information section
    Environment Name Enter a name for the new hybrid environment.
    Description (optional) Enter a short description for the new hybrid environment.
    Select Cloud Provider Select the cloud provider of your choice.
  6. If you already have a credential set up, select it from the drop-down list.
  7. If you must create new credentials, enter or select the following information:
    Environments page
    Amazon Web Services Credential section
    Name Enter a name for the new credential.
    Description (optional) Enter a short description for the new credential.
    Enable Permission Verification Click this toggle to have Cloudera check permissions for your credential. Cloudera will verify that you have the required permissions for your environment.
    Default | Minimal

    Select whether to use Default or Minimal role.

    Use the provided JSON to create the AWS IAM policy.

    Use Minimal role for a general hybrid environment. Use Default if you plan to use Data Services.

    Service Manager Account ID

    External ID

    Use the provided IDs to create the AWS IAM role.
    Cross-account Role ARN Enter the ARN of the cross-account role.
    SHOW CLI COMMAND (optional) Click this button to display the command required to create the credential from the CLI.
    Create Credential Click this button to create the new credential.

    Click Next to proceed to the Region, Networking and Security step.

    Region, Networking and Security page
    Region, Location section
    Select Region Select the region for the new environment.
    Network section
    Select Network Select the existing virtual network where you want to provision all Cloudera resources. For more information, see VPC and subnet.
    Select Subnets Select existing subnets within the selected VPC. For more information, see VPC and subnet.
    Enable Cluster Connectivity Manager (CCM) Enable or disable Cluster Connectivity Manager. Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-premises classic clusters that are on private subnets. For more information, see the Cluster Connectivity Manager documentation.
    Enable Endpoint Access Gateway

    When the Cluster Connectivity Manager is enabled, you can optionally enable Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking.

    From the Select Subnets drop-down list for the Endpoint Access Gateway, select the public subnets for which you want to use the gateway. The number of subnets must be the same as selected under Select Subnets and the availability zones must match.

    For more information, see the Public Endpoint Access Gateway section.
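
The subnet rule for the Endpoint Access Gateway can be checked ahead of time. The sketch below assumes that "the availability zones must match" means both selections cover the same zones the same number of times. This is an interpretation, not a statement of how Cloudera validates the input. The subnet IDs and zone names are hypothetical.

```python
from collections import Counter
from typing import Dict


def gateway_subnets_match(cluster_subnets: Dict[str, str],
                          gateway_subnets: Dict[str, str]) -> bool:
    """Compare two selections given as {subnet ID: availability zone}.

    Returns True when both selections contain the same number of subnets
    and cover the same availability zones.
    """
    if len(cluster_subnets) != len(gateway_subnets):
        return False
    return Counter(cluster_subnets.values()) == Counter(gateway_subnets.values())


# Hypothetical subnet IDs and zones, for illustration only.
cluster = {"subnet-private-a": "us-west-2a", "subnet-private-b": "us-west-2b"}
gateway = {"subnet-public-a": "us-west-2a", "subnet-public-b": "us-west-2b"}
print(gateway_subnets_match(cluster, gateway))  # True
```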

    Encryption section
    Enable Customer Managed Keys Enable this if you want to provide a Customer-Managed Key (CMK) to encrypt the environment disks and databases. For more information, see Customer managed encryption keys.
    Proxies section
    Select Proxy Configuration
    Select one of the following options:
    • Do Not Use Proxy Configuration
    • Create New Proxy Configuration

      Select this option if you want Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      Enter the following information for the new proxy configuration:

      • Name
      • Description (optional)
      • Protocol
      • Server Host
      • Server Port
      • No Proxy Hosts
      • Inbound Proxy CIDR
      • Username
      • Password
    • Existing proxy configuration

      You must open all the required ports if you want to use your existing security groups.

    For more information, see Setting up a proxy server.

    Security Access Settings
    Select one of the following options to determine inbound security group settings that allow connections to the Cloudera Data Hub clusters from your organization’s computers:
    • Create New Security Groups

      Select this option if you want Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      • Access CIDR

        Enter a custom CIDR IP range for all new security groups that will be created for the Cloudera Data Hub clusters.

    • Select Existing Security Groups

      If you want to use your existing security groups, you must open all the required ports. Refer to Security groups to ensure that you open all ports required for your users to access environment resources.

      • Select Existing Security Group for Gateway Nodes.
      • Select Existing Security Group as default.
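
Before entering a value in the Access CIDR field, it can help to confirm that the range is well formed. The following is a minimal sketch using Python's standard `ipaddress` module. The function names are illustrative, and requiring an explicit prefix length is an assumption made here for safety, not a documented Cloudera rule.

```python
import ipaddress
from typing import Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_access_cidr(value: str) -> Network:
    """Parse a CIDR range such as 10.0.0.0/16, raising ValueError if invalid."""
    text = value.strip()
    if "/" not in text:
        # Without a prefix, ip_network() would silently treat the value as a
        # single host, which is rarely what is intended here.
        raise ValueError(f"{text!r} has no prefix length")
    # strict=True rejects ranges with host bits set, for example 10.0.0.1/16.
    return ipaddress.ip_network(text, strict=True)


def is_open_to_everyone(network: Network) -> bool:
    """True for 0.0.0.0/0 or ::/0, which allow connections from any address."""
    return network.prefixlen == 0
```

For example, `parse_access_cidr("10.0.0.0/16")` returns a network of 65536 addresses, while `parse_access_cidr("10.0.0.1/16")` raises `ValueError`.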
    SSH Settings section
    New SSH public key Enter a new SSH public key.
    Existing SSH public key Enter the name of an existing EC2 key pair that contains your desired SSH key.
    Add tags section
    Add tags (optional) Add tags to be created for your resources on AWS. For more information, see Defining custom tags.
    Advanced Options section
    Network And Availability Click the Enable Multiple Availability Zones for FreeIPA toggle to enable multiple availability zones for FreeIPA. For more information, see Deploying Cloudera in multiple AWS availability zones.
    Hardware And Storage Specify the instance types for the FreeIPA nodes. Click the edit icon in the top right corner and select the instance type from the drop-down list. For more information on instance types, see Amazon EC2 instance types.
    Cluster Extensions You can optionally select and attach previously registered recipes to run on a specific FreeIPA host group. For more information, see Recipes.
    Security Select one of the following SELinux modes based on your requirements:
    • Permissive
    • Enforcing

    Click Next to proceed to the Storage step.

    Storage page
    Logs section
    Logger Instance Profile Select the IAM instance profile (or IAM role) that provides Cloudera with write access to the S3 logs data location.
    Logs Location Base Provide a path to an existing S3 bucket or a directory within an existing S3 bucket where log data will be stored.
    Backup Location Base (optional)

    Provide a path to an existing S3 bucket or a directory within an existing S3 bucket where FreeIPA backups will be stored.

    If none is provided, the log location will be used.

    Telemetry section
    Enable Cloudera Observability (optional) When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, see Enabling workload analytics and logs collection.
    Register Environment page
    Microsoft Azure Credential section
    Name Enter a name for the new credential.
    Description (optional) Enter a short description for the new credential.
    Default | Minimal

    Select whether to use Default or Minimal role.

    Use the provided JSON to create the custom role in Azure.

    Use Minimal role for a general hybrid environment. Use Default if you plan to use Data Services.

    Command 1 Use the provided command in the Azure Shell to associate the new certificate with the service principal.
    Command 2 Use the provided command in the Azure Shell to identify your Subscription ID and Tenant ID.
    Show CLI Command (optional) Click this button to display the command required to create the credential from the CLI.
    Create Credential Click this button to create the credential.

    Click Next to proceed to the Region, Networking and Security step.

    Region, Networking and Security page
    Region, Location section
    Select Region Select the region for the new environment.
    Resource Group section
    Select Resource Group

    Select one of the following options:

    • Select an existing resource group to have all Cloudera resources provisioned into that resource group.
    • Select the Create new resource groups option to have Cloudera create multiple resource groups.
    Network section
    Select Network Select the existing virtual network where you want to provision all Cloudera resources. For more information, see VPC and subnet.
    Select Subnets This option is only available if you choose to use an existing network. You must select multiple subnets. Cloudera distributes resources evenly across the selected subnets.
    Enable Cluster Connectivity Manager (CCM) Enable or disable Cluster Connectivity Manager. Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-premises classic clusters that are on private subnets. For more information, see the Cluster Connectivity Manager documentation.
    Enable Endpoint Access Gateway When Cluster Connectivity Manager is enabled, you can optionally enable Public Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking.

    If you are using your existing VNET, from the Select Endpoint Access Gateway Subnets drop-down list, select the public subnets for which you want to use the gateway. The number of subnets must match that set under Select Subnets, and the availability zones must match. For more information, see Public Endpoint Access Gateway.

    Create Public IPs This option is disabled by default when Cluster Connectivity Manager is enabled. It is enabled by default when Cluster Connectivity Manager is disabled.
    Database section
    Database
    Select one of the following options:
    • Flexible Server
    • Flexible Server with Private Link

      You must select the Private DNS Zone for the database from the drop-down menu.

    • Flexible Server with Delegated Subnet

    For more information on Flexible Servers, see Using Azure Database for PostgreSQL Flexible Server.

    Encryption section
    Enable Encryption at Host Azure Encryption at Host is a security feature that provides end-to-end encryption for your Virtual Machine (VM) data. Unlike standard encryption that happens at the storage layer, this feature ensures that data is encrypted the moment it is processed by the physical server (the host) where your VM is running.
    Enable Customer Managed Keys Enable this option if you want to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, see Customer managed encryption keys.
    Proxies section
    Select Proxy Configuration
    Select one of the following options:
    • Do Not Use Proxy Configuration
    • Create New Proxy Configuration

      Select this option if you want Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      Enter the following information for the new proxy configuration:

      • Name
      • Description (optional)
      • Protocol
      • Server Host
      • Server Port
      • No Proxy Hosts
      • Inbound Proxy CIDR
      • Username
      • Password
    • Select existing proxy configuration

      You must open all the required ports if you want to use your existing security groups.

    For more information, see Setting up a proxy server.

    Security Access Settings
    Select one of the following options to determine inbound security group settings that allow connections to the Cloudera Data Hub clusters from your organization’s computers:
    • Create New Security Groups

      Select this option if you want Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      • Access CIDR

        Enter a custom CIDR IP range for all new security groups that will be created for the Cloudera Data Hub clusters.

    • Select Existing Security Groups

      If you want to use your existing security groups, you must open all the required ports. Refer to Security groups to ensure that you open all ports required for your users to access environment resources.

      • Select Existing Security Group for Gateway Nodes.
      • Select Existing Security Group as default.
    SSH Settings section
    New SSH public key Enter a new SSH public key.
    Existing SSH public key Enter the name of an existing SSH key pair.
    Add tags section
    Add tags (optional) Add tags to be created for your resources on Azure. For more information, see Defining custom tags.
    Advanced Options section
    Network And Availability Click the Enable Multiple Availability Zones for FreeIPA toggle to enable multiple availability zones for FreeIPA. For more information, see Deploying Cloudera in multiple Azure availability zones.
    Hardware And Storage You can specify an instance type for each host group. For more information on instance types, see Sizes for virtual machines in Azure.
    Cluster Extensions You can optionally select and attach previously registered recipes to run on FreeIPA nodes. For more information, see Recipes.
    Security

    Select one of the following SELinux modes based on your requirements:

    • Permissive
    • Enforcing

    Click Next to proceed to the Storage step.

    Storage page
    Logs section
    Logger Instance Profile The logger requires the Storage Blob Data Contributor role on the provided storage account.
    Logs Location Base Provide your filesystem and storage account name in a filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format where data will be stored.
    • Filesystem must already exist.
    • The storage account must be Storage V2.
    • Subfolders are optional.
    Backup Location Base (optional) Provide your filesystem and storage account name in a filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format where FreeIPA backups will be stored.
    • Filesystem must already exist.
    • The storage account must be Storage V2.
    • Subfolders are optional.
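
The `filesystem@storageaccountname.dfs.core.windows.net[/subfolders]` format used for both location fields can be checked with a short parser. This is a sketch only. It checks the overall shape of the value and does not enforce Azure naming rules or confirm that the filesystem exists. The names `StorageLocation` and `parse_location` are illustrative.

```python
import re
from typing import NamedTuple


class StorageLocation(NamedTuple):
    filesystem: str
    storage_account: str
    subfolders: str  # empty string when no subfolders are given


_LOCATION = re.compile(
    r"(?P<filesystem>[^@/\s]+)"                          # filesystem name
    r"@(?P<account>[^@/.\s]+)\.dfs\.core\.windows\.net"  # storage account host
    r"(?P<subfolders>/\S*)?"                             # optional subfolders
)


def parse_location(value: str) -> StorageLocation:
    """Split a location value into its parts, raising ValueError if malformed."""
    match = _LOCATION.fullmatch(value.strip())
    if match is None:
        raise ValueError(
            "expected filesystem@storageaccountname.dfs.core.windows.net[/subfolders]"
        )
    subfolders = (match.group("subfolders") or "").strip("/")
    return StorageLocation(match.group("filesystem"), match.group("account"), subfolders)
```

With hypothetical names, `parse_location("logs@examplestorage.dfs.core.windows.net/env1/freeipa")` yields the filesystem `logs`, the storage account `examplestorage`, and the subfolders `env1/freeipa`.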
    Telemetry section
    Enable Cloudera Observability (optional) When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, see Enabling workload analytics and logs collection.
    Register Environment page
    Google Cloud Platform Credential section
    Name Enter a name for the new credential.
    Description (optional) Enter a short description for the new credential.
    Default | Minimal

    Select whether to use Default or Minimal role.

    Use the provided commands to create a service account through the Google Cloud SDK or Google Cloud Shell.

    Use Minimal role for a general hybrid environment. Use Default if you plan to use Data Services.

    Upload file Use the Upload file button to upload a service account private key in JSON format.
    Show CLI Command (optional) Click this button to display the command required to create the credential from the CLI.
    Create Credential Click this button to create the credential.

    Click Next to proceed to the Region, Networking and Security step.

    Region, Networking and Security page
    Region, Location section
    Select Region Select the region for the new environment.
    Select Zone Select the zone within the selected region.
    Network section
    Use Shared VPC Shared VPC allows an organization to connect resources from multiple projects to a common Virtual Private Cloud (VPC) network, so that they can communicate with each other securely and efficiently using internal IPs from that network. When you use Shared VPC, you designate a project as a host project and attach one or more service projects to it. The VPC networks in the host project are called Shared VPC networks. Eligible resources from service projects can use subnets in the Shared VPC network. For more information, see https://cloud.google.com/vpc/docs/shared-vpc
    Select Network Select the existing VPC where you want to provision all Cloudera resources. For more information, see VPC and subnet.
    Select Subnets Select at least one subnet within the selected VPC. For more information, see VPC and subnet.
    Create Private Subnets

    This option is only available if you select to have a new network and subnets created. It is turned on by default so that private subnets are created in addition to public subnets. If you disable it, only public subnets will be created.

    For production deployments, Cloudera recommends using private subnets. Work with your internal IT teams to ensure that users can access the browser interfaces for cluster services.

    Enable Cluster Connectivity Manager (CCM) Enable or disable Cluster Connectivity Manager. Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-premises classic clusters that are on private subnets. For more information, see the Cluster Connectivity Manager documentation.
    Enable Endpoint Access Gateway When Cluster Connectivity Manager is enabled, you can optionally enable Public Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking.

    If you are using your existing VPC, under Select Endpoint Access Gateway Subnets, select the public subnets for which you want to use the gateway. The number of subnets must match that set under Select Subnets, and the availability zones must match. For more information, see Public Endpoint Access Gateway.

    Create Public IPs This option is disabled by default when Cluster Connectivity Manager is enabled. It is enabled by default when Cluster Connectivity Manager is disabled.
    Encryption section
    Enable Customer Managed Keys Enable this if you want to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, see Customer managed encryption keys.
    Proxies section
    Select Proxy Configuration
    Select one of the following options:
    • Do Not Use Proxy Configuration
    • Create New Proxy Configuration

      Select this option if you want Cloudera to automatically create security groups for you and open them to the specified CIDR range.

      Enter the following information for the new proxy configuration:

      • Name
      • Description (optional)
      • Protocol
      • Server Host
      • Server Port
      • No Proxy Hosts
      • Inbound Proxy CIDR
      • Username
      • Password
    • Existing proxy configuration

      If you want to use your existing security groups, you must open all the required ports.

    For more information, see Setting up a proxy server.

    Security Access Settings
    Select one of the following options:
    • Do not create firewall rule

      Select this option if you are using a shared VPC and have already set the firewall rules directly on the VPC.

    • Provide existing firewall rules

      If not all of your firewall rules are set directly on the VPC, provide the previously created firewall rules for SSH and UI access. You must select two existing firewall rules, one for Knox gateway-installed nodes and another for all other nodes. You can select the same firewall rule in both places if needed.

    For information on required ports, see Firewall rules.

    SSH Settings section
    New SSH public key Enter a new SSH public key.
    Existing SSH public key Enter the name of an existing SSH key pair.
    Add tags section
    Add tags (optional) Add tags to be created for your resources on GCP. For more information, see Defining custom tags.
    Advanced Options section
    Network And Availability Click the Enable Multiple Availability Zones for FreeIPA toggle to enable multiple availability zones for FreeIPA. For more information, see Deploying Cloudera In Multiple GCP Availability Zones.
    Hardware And Storage You can specify an instance type for each host group. For more information on instance types, see the Google Cloud documentation on machine types.
    Cluster Extensions You can optionally select and attach previously registered recipes to run on FreeIPA nodes.
    Security

    Select one of the following SELinux modes based on your requirements:

    • Permissive
    • Enforcing

    Click Next to proceed to the Storage step.

    Storage page
    Logs section
    Logger Service Profile Select the service account that provides Cloudera with write access to the Google Cloud Storage (GCS) location where logs will be stored.
    Logs Location Base Provide a path to an existing GCS bucket or a directory within an existing GCS bucket where data will be stored. For more information, see Minimum setup for cloud storage.
    Backup Location Base (optional) Provide a path to an existing GCS bucket or a directory within an existing GCS bucket where FreeIPA backups will be stored. For more information, see Minimum setup for cloud storage.
    Telemetry section
    Enable Cloudera Observability (optional) When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, see Enabling workload analytics and logs collection.
  8. Click Register Environment to finish the hybrid environment registration process.
You have created the hybrid environment.

After your environment is running, perform the following steps: