Registering Cloudera Hybrid Environments
Learn how to register a hybrid environment.
Ensure that you have fulfilled the following Cloud Provider requirements based on the Cloud Service Provider of your choice: AWS, Azure, or GCP.
AWS
- Ensure that your AWS account has the right permissions as described in AWS account permissions.
- Create the Cross-account Role ARN corresponding to the Cross Account Role as described in Creating cross-account access IAM role.
- Create your credential to provision the environment as described in Creating a provisioning credential for AWS.
- Ensure that the VPC and subnet requirements are met as described in VPC and subnets.
- Create or use your existing security groups as described in Security groups.
  - If you plan to create new Security Groups, make sure you have the required IP address range information.
  - If you plan to use existing Security Groups, you need to open all the required ports.
- If you plan to use Customer Managed Encryption Keys (CMEK), configure them as described in Customer managed encryption keys.
- Create or use your existing SSH public key as described in SSH key pair.
  - If you plan to create a new SSH key, make sure you are using an RSA or ED25519 public key. This will create a new EC2 key pair on the AWS side, and all cloud resources will use it for SSH authentication.
  - If you plan to use an existing SSH key, you need to refer to an existing AWS EC2 key pair. The Cloudera Control Plane will validate that the key exists.
- Create an S3 bucket and set up the Logs location as described in AWS cloud storage prerequisites.
Azure
- Ensure that your Azure account has the right permissions as described in Azure subscription requirements.
- Create a custom role with the required set of permissions as described in Azure credential prerequisites.
- Create your credential to provision the environment as described in Creating a provisioning credential for Azure.
- Create or let Cloudera create resource groups for the environment as described in Resource groups.
- Ensure that the VNet and subnet requirements are met as described in VNet and subnets.
- Configure Azure Flexible Server as described in Private setup for Azure Flexible Server.
- If you plan to use Customer Managed Encryption Keys (CMEK), configure them as described in Encrypting Azure resources with customer managed keys.
- Create or use your existing security groups as described in Network security groups.
  - If you plan to create new Security Groups, make sure you have the required IP address range information.
  - If you plan to use existing Security Groups, you need to open all the required ports.
- Create or use your existing SSH public key as described in SSH key pair.
- Create an ADLS Gen2 storage and set up the Logs location as described in Azure cloud storage prerequisites.
GCP
- Ensure that your Google account has the right permissions as described in GCP permissions.
- Create a Google project as described in GCP project.
- Create a service account with the required set of permissions as described in Service account for credential.
- Create your credential to provision the environment as described in Creating a GCP credential.
- Ensure that the VPC and subnet requirements are met as described in VPC network and subnet.
- If you plan to use Customer Managed Encryption Keys (CMEK), configure them as described in Customer managed encryption keys.
- Create or use your existing SSH public key as described in SSH key pair.
- Create a Google storage bucket and set up the Logs location as described in GCP cloud storage prerequisites.
Required role: EnvironmentCreator
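The AWS prerequisites above ask for an RSA or ED25519 public key when you create a new SSH key. As a rough pre-flight check, the following sketch reads the algorithm field of an OpenSSH-format public key line. The helper names are hypothetical, and the check only inspects the text prefix; it does not verify the key material or reproduce any validation that Cloudera performs.

```python
# Key types named in the AWS prerequisites for a new SSH key.
ACCEPTED_KEY_TYPES = {"ssh-rsa", "ssh-ed25519"}


def ssh_key_type(public_key: str) -> str:
    """Return the algorithm field of an OpenSSH public key line.

    An OpenSSH public key line has the shape
    '<type> <base64-data> [comment]'.
    """
    fields = public_key.strip().split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-data> [comment]'")
    return fields[0]


def is_accepted_key(public_key: str) -> bool:
    """True if the key line declares one of the accepted key types."""
    return ssh_key_type(public_key) in ACCEPTED_KEY_TYPES


if __name__ == "__main__":
    # Placeholder key data, not a real key.
    sample = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5 user@example"
    print(ssh_key_type(sample), is_accepted_key(sample))
```

A key line that declares another type, such as `ecdsa-sha2-nistp256`, is reported as not accepted by this sketch.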
- Navigate to Cloudera Management Console.
- Select Environments.
- Click Register environment.
- In the Purpose section, select Hybrid Cloud Environment.
- Enter the following information for the new hybrid environment:
Environments page
General Information section
- Environment Name: Enter a name for the new hybrid environment.
- Description (optional): Enter a short description for the new hybrid environment.
- Select Cloud Provider: Select the cloud provider of your choice.
- If you already have a credential set up, select it from the dropdown list.
- If you need to create new credentials, enter or select the following information:
Register Environment page: Amazon Web Services
Credential section
- Name: Enter a name for the new credential.
- Description (optional): Enter a short description for the new credential.
- Enable Permission Verification: Use this toggle to have Cloudera verify that your credential has the permissions required for your environment.
- Default | Minimal: Select whether to use the Default or Minimal role. Use the provided JSON to create the AWS IAM policy. Use the Minimal role for a general hybrid environment. Use Default if you plan to use Data Services.
- Service Manager Account ID and External ID: Use the provided IDs to create the AWS IAM role.
- Cross-account Role ARN: Enter the ARN of the cross-account role.
- Show CLI Command (optional): Click this button to display the command required to create the credential from the CLI.
- Create Credential: Click this button to create the new credential.
Click Next to proceed to the Region, Networking and Security page.
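To illustrate how the Service Manager Account ID and External ID fit into the IAM role, the following sketch builds a trust policy in the standard AWS shape for cross-account access with an external ID. The function name and the sample values are placeholders, and the policy that your environment actually needs is the one described in Creating cross-account access IAM role; treat this only as a picture of where the two IDs go.

```python
import json


def build_trust_policy(service_manager_account_id: str, external_id: str) -> dict:
    """Build a cross-account trust policy document.

    The account ID identifies who may assume the role; the external ID
    is required as a condition on the sts:AssumeRole call.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{service_manager_account_id}:root"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }


# Placeholder values; use the IDs shown on the Register Environment page.
policy = build_trust_policy("123456789012", "example-external-id")

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

The ARN of the role created with such a trust policy is the value entered under Cross-account Role ARN.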
Region, Networking and Security page
Region, Location section
- Select Region: Select the region for the new environment.
Network section
- Select Network: Select the existing virtual network where you would like to provision all Cloudera resources. For more information, refer to VPC and subnet.
- Select Subnets: Select existing subnets within the selected VPC. For more information, refer to VPC and subnet.
- Enable Cluster Connectivity Manager (CCM): The Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-prem classic clusters that are on private subnets. For more information, refer to the Cluster Connectivity Manager documentation.
- Enable Endpoint Access Gateway: When the Cluster Connectivity Manager is enabled, you can optionally enable Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking. Under Select Subnets for the Endpoint Access Gateway, select the public subnets for which you would like to use the gateway. The number of subnets must be the same as the number selected under Select Subnets, and the availability zones must match. For more information, refer to the Public Endpoint Access Gateway section.
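The two Endpoint Access Gateway constraints (same number of subnets, matching availability zones) can be checked before you fill in the form. The sketch below assumes you already have a mapping from subnet ID to availability zone for both selections; the function name is hypothetical, and comparing the zones as multisets is one reading of "the availability zones must match".

```python
from collections import Counter


def check_gateway_subnets(
    workload_subnets: dict[str, str],
    gateway_subnets: dict[str, str],
) -> list[str]:
    """Compare two {subnet_id: availability_zone} selections.

    Returns a list of human-readable problems; an empty list means both
    constraints described above are satisfied.
    """
    problems = []
    if len(gateway_subnets) != len(workload_subnets):
        problems.append(
            f"subnet count differs: {len(workload_subnets)} selected under "
            f"Select Subnets, {len(gateway_subnets)} for the gateway"
        )
    # Compare zones as multisets so that duplicates are counted too.
    if Counter(workload_subnets.values()) != Counter(gateway_subnets.values()):
        problems.append("availability zones do not match")
    return problems


if __name__ == "__main__":
    # Placeholder subnet IDs and zones.
    private = {"subnet-a": "us-west-2a", "subnet-b": "us-west-2b"}
    public = {"subnet-c": "us-west-2a", "subnet-d": "us-west-2b"}
    print(check_gateway_subnets(private, public))
```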
Encryption section
- Enable Customer Managed Keys: Enable this if you would like to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, refer to Customer managed encryption keys.
Proxies section
- Select Proxy Configuration: Select one of the following options:
  - Do Not Use Proxy Configuration
  - Create New Proxy Configuration: Select this option to define a new proxy configuration, then enter the following information:
    - Name
    - Description (optional)
    - Protocol
    - Server Host
    - Server Port
    - No Proxy Hosts
    - Inbound Proxy CIDR
    - Username
    - Password
  - Existing proxy configuration: Select this option to use a proxy configuration that you have already registered.
  For more information, refer to Setting up a proxy server.
- Security Access Settings: Select one of the following options to determine inbound security group settings that allow connections to the Cloudera Data Hub clusters from your organization’s computers:
  - Create New Security Groups: Select this option if you would like Cloudera to automatically create security groups for you and open them to the CIDR range specified.
    - Access CIDR: Enter a custom CIDR IP range for all new security groups that will be created for the Cloudera Data Hub clusters.
  - Select Existing Security Groups: Select this option if you would like to use your existing security groups. In this case, you need to open all the required ports. Refer to Security groups to ensure that you open all ports required for your users to access environment resources.
    - Select Existing Security Group for Gateway Nodes.
    - Select Existing Security Group as default.
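Before entering a value under Access CIDR, it can help to confirm that the range is well formed and to see how many addresses it opens. The following sketch uses Python's standard `ipaddress` module; the function name and sample range are illustrative, and the sketch does not reproduce any validation the Management Console itself applies.

```python
import ipaddress


def describe_access_cidr(value: str) -> str:
    """Parse a CIDR range and summarize how wide it is.

    strict=True rejects ranges whose host bits are set, for example
    203.0.113.5/24, by raising ValueError.
    """
    network = ipaddress.ip_network(value.strip(), strict=True)
    if network.prefixlen == 0:
        return f"{network} is open to every address"
    return f"{network} covers {network.num_addresses} addresses"


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation-only range used as a placeholder.
    print(describe_access_cidr("203.0.113.0/24"))
```

A narrow range that matches your organization's egress addresses keeps the new security groups closed to everything else.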
SSH Settings section
- New SSH public key: Enter a new SSH public key.
- Existing SSH public key: Enter the name of an existing EC2 key pair that contains your desired SSH key.
Add tags section
- Add (optional): You can optionally add tags to be created for your resources on AWS. For more information, refer to Defining custom tags.
Advanced options section
- Network And Availability: Enable Multiple Availability Zones for FreeIPA. For more information, refer to Deploying Cloudera in multiple AWS availability zones.
- Hardware And Storage: Select the instance type for the FreeIPA nodes. Click the edit icon in the top right corner and select the instance type from the drop-down list. For more information on instance types, refer to Amazon EC2 instance types.
- Cluster Extensions: You can optionally select and attach previously registered recipes to run on a specific FreeIPA host group. For more information, see Recipes.
- Security: Select the SELinux mode based on your requirements:
  - Permissive
  - Enforcing
Click Next to proceed to the Storage page.
Storage page
Logs section
- Logger Instance Profile: Select the IAM instance profile (or IAM role) that provides Cloudera with write access to the S3 logs data location.
- Logs Location Base: Provide a path to an existing S3 bucket or a directory within an existing S3 bucket where log data will be stored.
- Backup Location Base (optional): Provide a path to an existing S3 bucket or a directory within an existing S3 bucket where FreeIPA backups will be stored. If none is provided, the logs location is used.
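The fallback between the two storage fields can be stated in a few lines. This is only a sketch of the rule described above, with a hypothetical function name and placeholder paths; it says nothing about how Cloudera implements it.

```python
from typing import Optional


def effective_backup_location(
    logs_location_base: str,
    backup_location_base: Optional[str] = None,
) -> str:
    """Return where backups end up under the rule described above.

    If no backup location is given (None or blank), the logs location
    is used instead.
    """
    if backup_location_base and backup_location_base.strip():
        return backup_location_base.strip()
    return logs_location_base


if __name__ == "__main__":
    # Placeholder paths, not a statement about the expected path syntax.
    print(effective_backup_location("example-bucket/logs"))
    print(effective_backup_location("example-bucket/logs", "example-bucket/backups"))
```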
Telemetry section
- Enable Cloudera Observability (optional): When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, refer to Enabling workload analytics and logs collection.
Register Environment page: Microsoft Azure
Credential section
- Name: Enter a name for the new credential.
- Description (optional): Enter a short description for the new credential.
- Default | Minimal: Select whether to use the Default or Minimal role. Use the provided JSON to create the custom role in Azure. Use the Minimal role for a general hybrid environment. Use Default if you plan to use Data Services.
- Command 1: Use the provided command in the Azure Shell to associate the new certificate with the service principal.
- Command 2: Use the provided command in the Azure Shell to identify your Subscription ID and Tenant ID.
- Show CLI Command (optional): Click this button to display the command required to create the credential from the CLI.
- Create Credential: Click this button to create the credential.
Click Next to proceed to the Region, Networking and Security page.
Region, Networking and Security page
Region, Location section
- Select Region: Select the region for the new environment.
Resource Group section
- Select Resource Group: Select one of the following:
  - Select an existing resource group to have all Cloudera resources provisioned into that resource group.
  - Select Create new resource groups to have Cloudera create multiple resource groups.
Network section
- Select Network: Select the existing virtual network where you would like to provision all Cloudera resources. Refer to VNet and subnets.
- Select Subnets: This option is only available if you choose to use an existing network. Multiple subnets must be selected and Cloudera distributes resources evenly within the subnets.
- Enable Cluster Connectivity Manager (CCM): The Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-prem classic clusters that are on private subnets. For more information, refer to the Cluster Connectivity Manager documentation.
- Enable Endpoint Access Gateway: When Cluster Connectivity Manager is enabled, you can optionally enable Public Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking. If you are using your existing VNet, under Select Endpoint Access Gateway Subnets, select the public subnets for which you would like to use the gateway. The number of subnets must match the number set under Select Subnets, and the availability zones must match. For more information, refer to Public Endpoint Access Gateway.
- Create Public IPs: This option is disabled by default when Cluster Connectivity Manager is enabled. It is enabled by default when Cluster Connectivity Manager is disabled.
Database section
- Database: Select one of the following:
  - Flexible Server
  - Flexible Server with Private Link: You must select the Private DNS Zone for the database from the drop-down menu.
  - Flexible Server with Delegated Subnet (deprecated)
  For more information on Flexible Servers, refer to Using Azure Database for PostgreSQL Flexible Server.
Encryption section
- Enable Encryption at Host
- Enable Customer Managed Keys: Enable this option if you would like to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, refer to Customer managed encryption keys.
Proxies section
- Select Proxy Configuration: Select one of the following:
  - Do Not Use Proxy Configuration
  - Create New Proxy Configuration: Select this option to define a new proxy configuration, then enter the following information:
    - Name
    - Description (optional)
    - Protocol
    - Server Host
    - Server Port
    - No Proxy Hosts
    - Inbound Proxy CIDR
    - Username
    - Password
  - Select existing proxy configuration: Select this option to use a proxy configuration that you have already registered.
  For more information, refer to Setting up a proxy server.
- Security Access Settings: Select one of the following options to determine inbound security group settings that allow connections to the Cloudera Data Hub clusters from your organization’s computers:
  - Create New Security Groups: Select this option if you would like Cloudera to automatically create security groups for you and open them to the CIDR range specified.
    - Access CIDR: Enter a custom CIDR IP range for all new security groups that will be created for the Cloudera Data Hub clusters.
  - Select Existing Security Groups: Select this option if you would like to use your existing security groups. In this case, you need to open all the required ports. Refer to Security groups to ensure that you open all ports required for your users to access environment resources.
    - Select Existing Security Group for Gateway Nodes.
    - Select Existing Security Group as default.
SSH Settings section
- New SSH public key: Enter a new SSH public key.
- Existing SSH public key: Enter the name of an existing SSH key pair.
Add tags section
- Add (optional): You can optionally add tags to be created for your resources on Azure. For more information, refer to Defining custom tags.
Advanced options section
- Network And Availability: Enable Multiple Availability Zones for FreeIPA. For more information, refer to Deploying Cloudera in multiple Azure availability zones.
- Hardware And Storage: You can specify an instance type for each host group. For more information on instance types, refer to Sizes for virtual machines in Azure.
- Cluster Extensions: You can optionally select and attach previously registered recipes to run on FreeIPA nodes. For more information, see Recipes.
- Security: Select the SELinux mode based on your requirements:
  - Permissive
  - Enforcing
Click Next to proceed to the Storage page.
Storage page
Logs section
- Logger Instance Profile: The logger requires the Storage Blob Data Contributor role on the provided storage account.
- Logs Location Base: Provide your filesystem and storage account name in the filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format where data will be stored.
  - The filesystem must already exist.
  - The storage account must be Storage V2.
  - Subfolders are optional.
- Backup Location Base (optional): Provide your filesystem and storage account name in the filesystem@storageaccountname.dfs.core.windows.net[/subfolders] format where FreeIPA backups will be stored.
  - The filesystem must already exist.
  - The storage account must be Storage V2.
  - Subfolders are optional.
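The location format used for both fields, filesystem@storageaccountname.dfs.core.windows.net[/subfolders], can be checked for shape before you paste it into the form. The sketch below is a hypothetical helper that accepts only that form and splits it into its parts. It checks syntax only; it cannot tell whether the filesystem exists or whether the storage account is Storage V2.

```python
import re
from typing import NamedTuple

# filesystem@storageaccountname.dfs.core.windows.net[/subfolders]
LOCATION_PATTERN = re.compile(
    r"^(?P<filesystem>[^@/\s]+)"
    r"@(?P<account>[^./@\s]+)\.dfs\.core\.windows\.net"
    r"(?P<subfolders>(?:/[^/\s]+)*)/?$"
)


class AdlsLocation(NamedTuple):
    filesystem: str
    storage_account: str
    subfolders: str  # empty string when no subfolders are given


def parse_location_base(value: str) -> AdlsLocation:
    """Split a location base string into filesystem, account, subfolders."""
    match = LOCATION_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(
            "expected filesystem@storageaccountname.dfs.core.windows.net[/subfolders]"
        )
    return AdlsLocation(
        filesystem=match["filesystem"],
        storage_account=match["account"],
        subfolders=match["subfolders"].lstrip("/"),
    )


if __name__ == "__main__":
    # Placeholder filesystem and storage account names.
    print(parse_location_base("logs@examplestorage.dfs.core.windows.net/env1/freeipa"))
```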
Telemetry section
- Enable Cloudera Observability (optional): When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, refer to Enabling workload analytics and logs collection.
Register Environment page: Google Cloud Platform
Credential section
- Name: Enter a name for the new credential.
- Description (optional): Enter a short description for the new credential.
- Default | Minimal: Select whether to use the Default or Minimal role. Use the provided commands to create a service account through the Google Cloud SDK or Google Cloud Shell. Use the Minimal role for a general hybrid environment. Use Default if you plan to use Data Services.
- Upload file: Use the Upload file button to upload a service account private key in JSON format.
- Show CLI Command (optional): Click this button to display the command required to create the credential from the CLI.
- Create Credential: Click this button to create the credential.
Click Next to proceed to the Region, Networking and Security page.
Region, Networking and Security page
Region, Location section
- Select Region: Select the region for the new environment.
- Select Zone: Select the zone within the selected region.
Network section
- Use Shared VPC: Shared VPC allows an organization to connect resources from multiple projects to a common Virtual Private Cloud (VPC) network, so that they can communicate with each other securely and efficiently using internal IPs from that network. When you use Shared VPC, you designate a project as a host project and attach one or more service projects to it. The VPC networks in the host project are called Shared VPC networks. Eligible resources from service projects can use subnets in the Shared VPC network. For more information, see https://cloud.google.com/vpc/docs/shared-vpc
- Select Network: Select the existing VPC where you would like to provision all Cloudera resources. Refer to VPC and subnet.
- Select Subnets: Select at least one subnet within the selected VPC. Refer to VPC and subnet.
- Create Private Subnets: This option is only available if you choose to have a new network and subnets created. It is turned on by default so that private subnets are created in addition to public subnets. If you disable it, only public subnets will be created. For production deployments, Cloudera recommends that you use private subnets. Work with your internal IT teams to ensure that users can access the browser interfaces for cluster services.
- Enable Cluster Connectivity Manager (CCM): The Cluster Connectivity Manager allows Cloudera to communicate with Cloudera Data Hub clusters and on-prem classic clusters that are on private subnets. For more information, refer to the Cluster Connectivity Manager documentation.
- Enable Endpoint Access Gateway: When Cluster Connectivity Manager is enabled, you can optionally enable Public Endpoint Access Gateway to provide secure connectivity to UIs and APIs in Cloudera Data Hub clusters deployed using private networking. If you are using your existing VPC, under Select Endpoint Access Gateway Subnets, select the public subnets for which you would like to use the gateway. The number of subnets must match the number set under Select Subnets, and the availability zones must match. For more information, refer to Public Endpoint Access Gateway.
- Create Public IPs: This option is disabled by default when Cluster Connectivity Manager is enabled. It is enabled by default when Cluster Connectivity Manager is disabled.
Encryption section
- Enable Customer Managed Keys: Enable this if you would like to provide a Customer-Managed Key (CMK) to encrypt the environment's disks and databases. For more information, refer to Customer managed encryption keys.
Proxies section
- Select Proxy Configuration: Select one of the following:
  - Do Not Use Proxy Configuration
  - Create New Proxy Configuration: Select this option to define a new proxy configuration, then enter the following information:
    - Name
    - Description (optional)
    - Protocol
    - Server Host
    - Server Port
    - No Proxy Hosts
    - Inbound Proxy CIDR
    - Username
    - Password
  - Existing proxy configuration: Select this option to use a proxy configuration that you have already registered.
  For more information, refer to Setting up a proxy server.
- Security Access Settings: You have two options:
  - Do not create firewall rule: Select this option if you are using a shared VPC and have already set the firewall rules directly on the VPC.
  - Provide existing firewall rules: If not all of your firewall rules are set directly on the VPC, provide the previously created firewall rules for SSH and UI access. You should select two existing firewall rules, one for Knox gateway-installed nodes and another for all other nodes. You may select the same firewall rule in both places if needed.
  For information on required ports, refer to Firewall rules.
SSH Settings section
- New SSH public key: Enter a new SSH public key.
- Existing SSH public key: Enter the name of an existing SSH key pair.
Add tags section
- Add (optional): You can optionally add tags to be created for your resources on GCP. For more information, refer to Defining custom tags.
Advanced options section
- Network And Availability: Enable Multiple Availability Zones for FreeIPA. For more information, refer to Deploying Cloudera In Multiple GCP Availability Zones.
- Hardware And Storage: You can specify an instance type for each host group. For more information on instance types, refer to the Compute Engine machine types documentation for GCP.
- Cluster Extensions: You can optionally select and attach previously registered recipes to run on FreeIPA nodes.
- Security: Select the SELinux mode based on your requirements:
  - Permissive
  - Enforcing
Click Next to proceed to the Storage page.
Storage page
Logs section
- Logger Service Profile: Select the Service Account that provides Cloudera with write access to the Google Cloud Storage location where logs will be stored.
- Logs Location Base: Provide a path to an existing GCS bucket or a directory within an existing GCS bucket where data will be stored. For more information, refer to Minimum setup for cloud storage.
- Backup Location Base (optional): Provide a path to an existing GCS bucket or a directory within an existing GCS bucket where FreeIPA backups will be stored. For more information, refer to Minimum setup for cloud storage.
Telemetry section
- Enable Cloudera Observability (optional): When this is enabled, diagnostic information about job and query execution is sent to Cloudera Observability for Cloudera Data Hub clusters. For more information, refer to Enabling workload analytics and logs collection.
- Click Register Environment to finish the hybrid environment registration process.
After your environment is running, perform the following steps:
- You must assign roles to users and groups to grant them access to the environment, and perform user sync. For steps, refer to Enabling admin and user access to environments.
- You must onboard your users and groups for cloud storage. For steps, refer to Onboarding Cloudera users and groups for cloud storage.
