Service account for the provisioning credential
The provisioning credential for Google Cloud relies on a service account that can be assumed by Cloudera.
The following flow describes how the Google Cloud provisioning credential works:
- Your GCP account administrator creates a service account and assigns the minimum permissions allowing Cloudera to create and manage resources in your Google Cloud account. Next, the administrator generates a service account access key pair for the service account.
- The service account is registered as a credential in Cloudera and its access key is uploaded to Cloudera.
- The credential is then used for registering your Google Cloud environment in Cloudera.
- Once this is done, Cloudera uses the credential for provisioning environment-related resources, workload clusters, and resources for other Cloudera services that you run in Cloudera.
Review the following to learn about the permissions required for the credential and how to create the service account.
Permissions for the provisioning credential's service account
To allow Cloudera to access and provision resources in your Google Cloud project, create a service account in that project and assign it the roles or granular permissions listed below. Then generate a JSON access key that you can later provide to Cloudera. Cloudera assumes this service account through the access key provided during credential creation and uses it to provision resources for your environment.
The service account must meet one of the following requirements:
- Option 1: Assign the following IAM roles at the project level. This is a simpler option.
- Option 2: Alternatively, you can create custom IAM roles with the following granular IAM permissions assigned and then assign the role to the service account at the project level. This allows you to minimize the number of permissions granted to Cloudera.
Option 1: IAM roles
IAM role | Scope | Description |
---|---|---|
iam.serviceAccounts.list IAM permission | Project | This is required in order for Cloudera to be able to list service account names that you created in your GCP project. You need to create a custom role in order to assign this permission. |
Compute Instance Admin (v1) | Project | This is required for provisioning of Compute Engine instances, disks, and images in your VPC. |
Storage Admin | Project | This is required for the creation of a storage bucket to store the Cloudbreak image objects. Delete permissions are not required. |
Compute Network Viewer | Project | This is required for read-only access to all networking resources. |
Compute Load Balancer Admin | Project | This role is required for load balancing between HA components of the Data Lake. |
Cloud SQL Admin | Project | This is required for Cloudera to have permission to cleanly create and delete a Data Lake and heavy duty flow management Cloudera Data Hub clusters. |
Compute Network User | Project | Required for shared VPC only. If you would like to use a shared VPC, you need this additional role in the scope of the host project of the VPC. |
Compute Public IP Admin | Project | Required only when not using Cluster Connectivity Manager. You need this additional role only if you are planning to disable Cluster Connectivity Manager for your environment. |
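As a rough illustration of Option 1, the following Python sketch prints the `gcloud projects add-iam-policy-binding` commands an administrator might run to grant the predefined roles. The role IDs are assumed to be the identifiers behind the display names in the table, and the function name, project IDs, and service account email are hypothetical placeholders. The `iam.serviceAccounts.list` permission is not covered here because it requires a custom role.

```python
import shlex

# Assumption: these role IDs correspond to the display names in the table above.
BASE_ROLES = [
    "roles/compute.instanceAdmin.v1",   # Compute Instance Admin (v1)
    "roles/storage.admin",              # Storage Admin
    "roles/compute.networkViewer",      # Compute Network Viewer
    "roles/compute.loadBalancerAdmin",  # Compute Load Balancer Admin
    "roles/cloudsql.admin",             # Cloud SQL Admin
]
SHARED_VPC_ROLE = "roles/compute.networkUser"   # Compute Network User
PUBLIC_IP_ROLE = "roles/compute.publicIpAdmin"  # Compute Public IP Admin


def binding_commands(project_id, sa_email, host_project_id=None, public_ip=False):
    """Return gcloud command strings that grant the Option 1 roles.

    host_project_id: set only when a shared VPC is used; the Compute Network
    User role is then granted on the host project of the VPC.
    public_ip: set to True only when Cluster Connectivity Manager is disabled.
    """
    grants = [(project_id, role) for role in BASE_ROLES]
    if public_ip:
        grants.append((project_id, PUBLIC_IP_ROLE))
    if host_project_id:
        grants.append((host_project_id, SHARED_VPC_ROLE))

    commands = []
    for project, role in grants:
        parts = [
            "gcloud", "projects", "add-iam-policy-binding", project,
            f"--member=serviceAccount:{sa_email}",
            f"--role={role}",
        ]
        commands.append(" ".join(shlex.quote(p) for p in parts))
    return commands


if __name__ == "__main__":
    # Hypothetical project and service account names.
    for cmd in binding_commands(
        "my-project", "cdp-provisioner@my-project.iam.gserviceaccount.com"
    ):
        print(cmd)
```

The sketch only builds command strings, so you can review them before running anything against your project.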
Option 2: Granular permissions
Granular IAM permissions | Scope | Description |
---|---|---|
iam.serviceAccounts.list | Project | This is required in order for Cloudera to be able to access service accounts that you created. |
iam.serviceAccounts.list cloudsql.instances.create cloudsql.instances.delete cloudsql.instances.get cloudsql.instances.list cloudsql.databases.update cloudsql.instances.startReplica cloudsql.instances.stopReplica cloudsql.instances.update cloudsql.instances.restart cloudsql.users.create | Project | Required for creating, stopping, starting, and deleting an external database for the Data Lake and Data Hub clusters. |
compute.addresses.get compute.addresses.use compute.disks.create compute.disks.delete compute.disks.setLabels compute.disks.use compute.firewalls.list compute.globalOperations.get compute.images.create compute.images.get compute.images.list compute.images.useReadOnly compute.instances.create compute.instances.delete compute.instances.get compute.instances.list compute.instances.setLabels compute.instances.setMetadata compute.instances.setServiceAccount compute.instances.setTags compute.instances.start compute.instances.stop compute.machineTypes.list compute.networks.get compute.networks.list compute.regionHealthChecks.useReadOnly compute.regionOperations.get compute.regions.get compute.regions.list compute.subnetworks.get compute.subnetworks.list compute.subnetworks.use compute.subnetworks.useExternalIp compute.zoneOperations.get | Project | Required for creating VMs from images in your VPC. |
compute.addresses.create compute.addresses.delete compute.addresses.get compute.addresses.use compute.instanceGroups.create compute.instanceGroups.delete compute.instanceGroups.get compute.instanceGroups.list compute.instanceGroups.update compute.instanceGroups.use compute.forwardingRules.create compute.forwardingRules.delete compute.forwardingRules.get compute.forwardingRules.list compute.forwardingRules.setLabels compute.forwardingRules.update compute.forwardingRules.use compute.regionBackendServices.create compute.regionBackendServices.delete compute.regionBackendServices.get compute.regionBackendServices.list compute.regionBackendServices.update compute.regionBackendServices.use compute.regionHealthChecks.create compute.regionHealthChecks.delete compute.regionHealthChecks.get compute.regionHealthChecks.list compute.regionHealthChecks.update compute.regionHealthChecks.use | Project | Required for load balancing between HA components of the Data Lake. |
compute.addresses.create compute.addresses.delete compute.addresses.get compute.addresses.use | Project | (Optional) Only required if public IPs are used. You do not need these permissions if you would like to use private IPs only. |
storage.buckets.create storage.buckets.get storage.buckets.getIamPolicy storage.objects.create storage.objects.delete storage.objects.get storage.objects.getIamPolicy | Project | (Optional) This is not required if you are planning to pre-create the GCS bucket for storing OS images for VMs. By default, Cloudera creates this bucket, but you can optionally pre-create it. See Storage bucket for OS images. |
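For Option 2, several rows of the table repeat the same permission (for example, `compute.addresses.get`), so it can help to merge the rows into one deduplicated list before creating the custom role. The sketch below does that in Python and emits a role definition with the `title`, `description`, `stage`, and `includedPermissions` fields, which is the layout a role definition file for `gcloud iam roles create --file` is assumed to use. The function name, role title, and the shortened permission groups are illustrative only; paste in the full rows you actually need.

```python
import json


def build_role_definition(title, description, permission_groups):
    """Merge whitespace-separated permission rows into one role definition.

    permission_groups: iterable of strings, each copied from one table row.
    Duplicate permissions across rows are collapsed and the result is sorted.
    """
    permissions = sorted(
        {perm for group in permission_groups for perm in group.split()}
    )
    return {
        "title": title,
        "description": description,
        "stage": "GA",
        "includedPermissions": permissions,
    }


if __name__ == "__main__":
    # Shortened excerpts of table rows, for illustration only.
    groups = [
        "iam.serviceAccounts.list",
        "compute.addresses.get compute.addresses.use compute.disks.create",
        "compute.addresses.create compute.addresses.delete "
        "compute.addresses.get compute.addresses.use",
    ]
    role = build_role_definition(
        "Cloudera Provisioner (example)",  # hypothetical title
        "Granular permissions for the provisioning credential",
        groups,
    )
    print(json.dumps(role, indent=2))
```

Sorting the merged list keeps the output stable, which makes later changes to the role easier to review.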
Create provisioning credential's service account and generate access key
Create a service account and generate a JSON access key.
Before you begin
Review the above permissions to learn what IAM permissions and IAM roles you need to assign to the service account that you will create.
Steps
- Log in to your Google Cloud account.
- Navigate to the project used for Cloudera.
- Navigate to IAM & Admin.
- To create a custom role:
    - Navigate to the Roles page.
    - Click +Create Role.
    - Specify a Title.
    - Specify an ID.
    - Click +Add Permissions.
    - Add the required granular permission(s).
    - Use the same steps to add all the required permissions.
    - Click Create.
- To create a service account:
    - Navigate to the Service accounts page.
    - Click Create service account.
    - Enter a service account name.
    - Click Create.
    - Under Grant this service account access to project, choose the IAM roles to grant to the service account on the project. You need to assign all of the required roles from the option that you chose above, including any custom role that you created.
    - When you are done adding all the required roles, click Done to finish creating the service account.
- To generate an access key:
    - Once the service account has been created, find the row of the service account that you want to create a key for. In that row, click the (context menu) button, and then click Create key.
    - Under Key type, select JSON and click Create.
    - Clicking Create downloads the service account key file. You will use the JSON access key to register the service account as a credential in Cloudera.
- Additionally, once you create the Logger and IDBroker service accounts, you need to update each of these two service accounts to grant the provisioning service account the Service Account User (iam.serviceAccountUser) role. The instructions are provided as part of Minimum setup for cloud storage.
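Before uploading the downloaded JSON access key to Cloudera, you may want to check that the file looks like a service account key. The following Python sketch is one way to do that. The field names (`type`, `project_id`, `private_key_id`, `private_key`, `client_email`) are assumed from the layout commonly seen in Google Cloud service account key files, and the function names and file path are hypothetical.

```python
import json

# Assumption: fields commonly present in a service account JSON key file.
REQUIRED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
)


def check_service_account_key(key):
    """Return a list of problems found in a parsed key file (empty if none)."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_KEY_FIELDS
        if not key.get(field)
    ]
    key_type = key.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type}")
    return problems


def load_key(path):
    """Read and parse the downloaded key file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Hypothetical path to the key file downloaded in the steps above.
    issues = check_service_account_key(load_key("provisioner-key.json"))
    print("key looks usable" if not issues else "\n".join(issues))
```

This check only inspects the file structure. It does not confirm that the key is valid or that the service account has the required roles.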
What to do next
Once you have this setup ready, you can Register a GCP credential in Cloudera.