Compute Instance Types

Learn about the supported AWS and Azure compute instance types that you can select while creating a Virtual Warehouse.

Supported AWS compute instance types

Cloudera Data Warehouse supports the following AWS compute instance types (Hive and Impala executors):

Instance type   Processor   Usage               Virtual Warehouse Support
r7gd.4xlarge    ARM         Compute             Hive and Impala
r6gd.4xlarge    ARM         Compute             Hive and Impala
r6id.4xlarge    Intel       Compute             Hive and Impala
r5d.4xlarge     Intel       Compute (default)   Hive and Impala
r5ad.4xlarge    AMD         Compute             Hive and Impala
r5dn.4xlarge    Intel       Compute             Hive and Impala
m5.2xlarge      Intel       Shared services     Hive and Impala

In a Cloudera Data Warehouse environment, the shared services components run in a Kubernetes cluster. The cluster starts with three m5.2xlarge instances running the Cloudera Data Warehouse service and autoscales, automatically adding instances when demand increases. Additionally, an Amazon Relational Database Service (RDS) db.r5.large instance running PostgreSQL is created to store user metadata for the Hue and Data Visualization services. In total, three shared db.r5.large nodes are used for this purpose. For more information, see Always active, shared services.
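As an illustration only, the AWS table above can be encoded as data and queried, for example to list the compute instance types for one processor family or to find the default. This is a minimal sketch, not a Cloudera API: the names InstanceType, AWS_INSTANCE_TYPES, compute_types, and default_compute_type are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    """One row of the AWS table above (hypothetical helper type)."""
    name: str
    processor: str
    usage: str
    default: bool = False


# Values copied from the AWS table above.
AWS_INSTANCE_TYPES = [
    InstanceType("r7gd.4xlarge", "ARM", "Compute"),
    InstanceType("r6gd.4xlarge", "ARM", "Compute"),
    InstanceType("r6id.4xlarge", "Intel", "Compute"),
    InstanceType("r5d.4xlarge", "Intel", "Compute", default=True),
    InstanceType("r5ad.4xlarge", "AMD", "Compute"),
    InstanceType("r5dn.4xlarge", "Intel", "Compute"),
    InstanceType("m5.2xlarge", "Intel", "Shared services"),
]


def compute_types(processor: Optional[str] = None) -> List[str]:
    """Return compute instance type names, optionally filtered by processor."""
    return [
        t.name
        for t in AWS_INSTANCE_TYPES
        if t.usage == "Compute" and (processor is None or t.processor == processor)
    ]


def default_compute_type() -> str:
    """Return the compute instance type marked as default in the table."""
    return next(t.name for t in AWS_INSTANCE_TYPES if t.usage == "Compute" and t.default)
```

For example, compute_types("ARM") returns the two ARM-based entries, and default_compute_type() returns r5d.4xlarge.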

Supported Azure compute instance types

Cloudera Data Warehouse supports the following Azure compute instance types (Hive and Impala executors):

Azure VM             Processor Type   Usage                                                                Virtual Warehouse Support
Standard_E16pds_v5   ARM              Compute                                                              Hive and Impala
Standard_E16_v3      Intel            Compute                                                              Hive and Impala
Standard_E16ds_v4    Intel            Compute (default)                                                    Hive and Impala
Standard_E16ads_v5   AMD              Compute                                                              Hive and Impala
Standard_E16ds_v5    Intel            Compute                                                              Hive and Impala
Standard_D8s_v4      Intel            Shared services (default)                                            Hive and Impala
Standard_D8as_v5     Intel            Shared services, used with AMD compute instance Standard_E16ads_v5   Hive and Impala

Three instances are added to the cluster to run the always-on shared services components. These shared nodes, of the Standard_D8s_v3 type, are dedicated to running the Cloudera Data Warehouse service. The Kubernetes cluster autoscales, automatically adding instances as needed to accommodate increased demand. In addition, an Azure Database for PostgreSQL flexible server (MemoryOptimized tier, Standard_E2s_v3 instance) is created to store user metadata for the Hue and Data Visualization services. For more information, see Always active, shared services.
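The pairing of shared services VM sizes with compute VM sizes in the Azure table above can be sketched as a small lookup. This is a hypothetical illustration, not a Cloudera API: the names AZURE_COMPUTE_VMS, DEFAULT_SHARED_VM, SHARED_VM_OVERRIDES, and shared_services_vm are invented for this example. It also assumes that every compute VM size other than Standard_E16ads_v5 uses the shared services size marked as default in the table, which the table implies but does not state outright.

```python
# Compute VM sizes and their processor types, copied from the Azure table above.
AZURE_COMPUTE_VMS = {
    "Standard_E16pds_v5": "ARM",
    "Standard_E16_v3": "Intel",
    "Standard_E16ds_v4": "Intel",
    "Standard_E16ads_v5": "AMD",
    "Standard_E16ds_v5": "Intel",
}

# Shared services size marked "(default)" in the table.
DEFAULT_SHARED_VM = "Standard_D8s_v4"

# The table lists Standard_D8as_v5 as used with the AMD compute size.
SHARED_VM_OVERRIDES = {"Standard_E16ads_v5": "Standard_D8as_v5"}


def shared_services_vm(compute_vm: str) -> str:
    """Return the shared services VM size paired with a compute VM size.

    Assumption: sizes without an explicit pairing in the table use the default.
    """
    if compute_vm not in AZURE_COMPUTE_VMS:
        raise ValueError(f"Unsupported Azure compute VM size: {compute_vm}")
    return SHARED_VM_OVERRIDES.get(compute_vm, DEFAULT_SHARED_VM)
```

Under these assumptions, shared_services_vm("Standard_E16ads_v5") returns Standard_D8as_v5, and any other supported compute size returns Standard_D8s_v4.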