Streams Messaging cluster layout
The Data Hub service includes three default Streams Messaging cluster definitions. These are the Streams Messaging: Light Duty, Streams Messaging: Heavy Duty cluster, and Streams Messaging: High Availability definitions. Learn about the layout, capacity, and components of these definitions.
Streams Messaging: Light Duty cluster layout
Learn about the layout, capacity, and components of the Streams Messaging: Light Duty cluster definition.
You can use a Streams Messaging: Light Duty cluster definition in development, testing, or proof of concept scenarios. Light Duty clusters include the following nodes and components (services):
Non-default nodes and scaling
Broker and KRaft nodes are not provisioned by default. You have the option to manually set how many of these nodes are created when provisioning the cluster. After the cluster is provisioned, the number of Broker and KRaft nodes can be changed by scaling your cluster. For more information about scaling KRaft and Broker Nodes, see Scaling Streams Messaging Clusters.
KRaft and Zookeeper
By default, Kafka uses Zookeeper as its metadata store. If you provision KRaft nodes in the cluster, Kafka runs in KRaft mode and uses KRaft as its metadata store. In this case, ZooKeeper instances are still provisioned. However, ZooKeeper is not used by Kafka to store and manage metadata. If required, ZooKeeper can be removed from the cluster after the provisioning is finished. ZooKeeper can be removed in Cloudera Manager. For more information, see Deleting ZooKeeper from Streams Messaging clusters
You can only provision an odd number of KRaft nodes. This is required so that KRaft can hold a majority election for leadership. While it is possible to run Kafka in KRaft mode with a single KRaft node, Cloudera recommends that you provision a minimum of three to avoid having a single point of failure. Cluster provisioning fails if an even number of KRaft nodes are provisioned.
Broker and Core Broker volume per instance count
By default, the volume per instance count for Broker and Core Broker nodes is identical. If you customize your cluster during provisioning, Cloudera recommends that Attached Volume per Instances is set to the same value for both node types. Alternatively, if you want to provision a cluster where the number of volumes is not identical, ensure that you complete Configure data directories for clusters with custom disk configurations after the cluster is provisioned. Otherwise, Kafka does not utilize all available volumes. Additionally, scaling the cluster might not be possible.
Default instance and storage configuration
The following table collects the default instance type and storage configuration of the various nodes deployed with the Streams Messaging: Light Duty cluster. For more information about the cloud provider-specific instance and storage types, see the Related Information section.
Node | Instance type | Storage configuration |
---|---|---|
Master | Standard_E8s_v3 | 100 GB Standard Locally-redundant SSD storage |
Core Broker | Standard_D8s_v3 | 1 TB Locally-redundant storage |
Broker | Standard_D8_v3 | 1 TB Locally-redundant storage |
KRaft | Standard_D8_v3 | 100 GB Locally-redundant storage |
Node | Instance type | Storage configuration |
---|---|---|
Master | r5.2xlarge | 100 GB Magnetic |
Core Broker | m5.2xlarge | 1 TB Throughput Optimized HDD |
Broker | m5.2xlarge | 1 TB Throughput Optimized HDD |
KRaft | m5.2xlarge | 100 GB Magnetic |
Node | Instance type | Storage configuration |
---|---|---|
Master | e2-highmem-8 | 100 GB Standard Persistent Disk (HDD) |
Core Broker | e2-standard-8 | 1 TB Standard persistent disks (HDD) |
Broker | e2-standard-8 | 1 TB Standard persistent disks (HDD) |
KRaft | e2-standard-8 | 100 GB Standard Persistent Disk (HDD) |
Streams Messaging: Heavy Duty cluster layout
Learn about the layout, capacity, and components of the Streams Messaging: Heavy Duty cluster definition.
You can use the Streams Messaging: Heavy Duty cluster definition in production scenarios. Heavy Duty clusters include the following nodes and components (services):
Non-default nodes and scaling
SRM, Broker, Connect, and KRaft nodes are not provisioned by default. If you want to have any of these services provisioned, you must manually set the instance count of the appropriate host group to at least one during cluster provisioning. Otherwise, the host group and its nodes are not provisioned. After a cluster is provisioned, you also have the option to scale these nodes. For more information on scaling, see Scaling Streams Messaging Clusters.
KRaft and Zookeeper
By default, Kafka uses Zookeeper as its metadata store. If you provision KRaft nodes in the cluster, Kafka runs in KRaft mode and uses KRaft as its metadata store. In this case, ZooKeeper instances are still provisioned. However, ZooKeeper is not used by Kafka to store and manage metadata. If required, ZooKeeper can be removed from the cluster after the provisioning is finished. ZooKeeper can be removed in Cloudera Manager. For more information, see Deleting ZooKeeper from Streams Messaging clusters
You can only provision an odd number of KRaft nodes. This is required so that KRaft can hold a majority election for leadership. While it is possible to run Kafka in KRaft mode with a single KRaft node, Cloudera recommends that you provision a minimum of three to avoid having a single point of failure. Cluster provisioning fails if an even number of KRaft nodes are provisioned.
Broker and Core Broker volume per instance count
By default, the volume per instance count for Broker and Core Broker nodes is identical. If you customize your cluster during provisioning, Cloudera recommends that Attached Volume per Instances is set to the same value for both node types. Alternatively, if you want to provision a cluster where the number of volumes is not identical, ensure that you complete Configure data directories for clusters with custom disk configurations after the cluster is provisioned. Otherwise, Kafka does not utilize all available volumes. Additionally, scaling the cluster might not be possible.
Default instance and storage configuration
The following table collects the default instance type and storage configuration of the various nodes deployed with the Streams Messaging: Heavy Duty cluster. For more information about the cloud provider-specific instance and storage types, see the Related Information section.
Node | Instance type | Storage configuration |
---|---|---|
Master | Standard_E8s_v3 | 100 GB Standard Locally-redundant SSD storage |
Core Broker | Standard_D8s_v3 | 1 TB Premium locally-redundant storage |
Broker | Standard_D8s_v3 | 1 TB Premium locally-redundant storage |
Registry | Standard_D8_v3 | 100 GB Locally-redundant storage |
SMM | Standard_D8_v3 | 100 GB Locally-redundant storage |
SRM | Standard_D8_v3 | 100 GB Locally-redundant storage |
Connect | Standard_D8_v3 | 100 GB Locally-redundant storage |
KRaft | Standard_D8_v3 | 100 GB Locally-redundant storage |
Node | Instance type | Storage configuration |
---|---|---|
Master | r5.2xlarge | 100 GB Magnetic |
Core Broker | m5.2xlarge | 1 TB General Purpose (GP3 SSD) |
Broker | m5.2xlarge | 1 TB General Purpose (GP3 SSD) |
Registry | m5.2xlarge | 100 GB Magnetic |
SMM | m5.2xlarge | 100 GB Magnetic |
SRM | m5.2xlarge | 100 GB Magnetic |
Connect | m5.2xlarge | 100 GB Magnetic |
KRaft | m5.2xlarge | 100 GB Magnetic |
Node | Instance type | Storage configuration |
---|---|---|
Master | e2-highmem-8 | 100 GB Standard persistent disks (HDD) |
Core Broker | e2-standard-8 | 1 TB Solid-state persistent disks (SSD) |
Broker | e2-standard-8 | 1 TB Solid-state persistent disks (SSD) |
Registry | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
SMM | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
SRM | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
Connect | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
KRaft | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
Streams Messaging: High Availability cluster layout
Learn about the layout, capacity, and components of the Streams Messaging: High Availability cluster definition.
You can use the Streams Messaging: High Availability cluster definition in production scenarios where having a highly available cluster spanning multiple Availability Zones (multi-AZ) is required. High Availability clusters include the following nodes and components (services):
Non-default nodes and scaling
SRM, Broker, Connect, and KRaft nodes are not provisioned by default. If you want to have any of these services provisioned, you must manually set the instance count of the appropriate host group to at least one during cluster provisioning. Otherwise, the host group and its nodes are not provisioned. After a cluster is provisioned, you also have the option to scale these nodes. For more information on scaling, see Scaling Streams Messaging Clusters.
Use multiple subnets for high availability
When using the Streams Messaging High Availability definition, ensure that you select multiple subnets when provisioning the cluster. Otherwise, your cluster will not be highly available.
KRaft and Zookeeper
By default, Kafka uses Zookeeper as its metadata store. If you provision KRaft nodes in the cluster, Kafka runs in KRaft mode and uses KRaft as its metadata store. In this case, ZooKeeper instances are still provisioned. However, ZooKeeper is not used by Kafka to store and manage metadata. If required, ZooKeeper can be removed from the cluster after the provisioning is finished. ZooKeeper can be removed in Cloudera Manager. For more information, see Deleting ZooKeeper from Streams Messaging clusters
You can only provision an odd number of KRaft nodes. This is required so that KRaft can hold a majority election for leadership. While it is possible to run Kafka in KRaft mode with a single KRaft node, Cloudera recommends that you provision a minimum of three to avoid having a single point of failure. Cluster provisioning fails if an even number of KRaft nodes are provisioned.
Broker and Core Broker volume per instance count
By default, the volume per instance count for Broker and Core Broker nodes is identical. If you customize your cluster during provisioning, Cloudera recommends that Attached Volume per Instances is set to the same value for both node types. Alternatively, if you want to provision a cluster where the number of volumes is not identical, ensure that you complete Configure data directories for clusters with custom disk configurations after the cluster is provisioned. Otherwise, Kafka does not utilize all available volumes. Additionally, scaling the cluster might not be possible.
Default instance and storage configuration
The following table collects the default instance type and storage configuration of the various nodes deployed with the Streams Messaging: High Availability cluster. For more information about the cloud provider-specific instance and storage types, see the Related Information section.
Node | Instance type | Storage configuration |
---|---|---|
Master | Standard_E16s_v3 | 100 GB Standard Locally-redundant SSD storage |
Manager 1 | Standard_D16_v3 | 100 GB Locally-redundant storage |
Core Broker | Standard_D8s_v3 | 1 TB Premium locally-redundant storage |
Broker | Standard_D8s_v3 | 1 TB Premium locally-redundant storage |
Core ZooKeeper | Standard_D8s_v3 | 100 GB Locally-redundant storage |
SRM | Standard_D8_v3 | 100 GB Locally-redundant storage |
Connect | Standard_D8_v3 | 100 GB Locally-redundant storage |
KRaft | Standard_D8_v3 | 100 GB Locally-redundant storage |
Node | Instance type | Storage configuration |
---|---|---|
Master | r5.4xlarge | 100 GB Magnetic |
Manager 1 | m5.4xlarge | 100 GB Magnetic |
Core Broker | m5.2xlarge | 1 TB General Purpose (GP3 SSD) |
Broker | m5.2xlarge | 1 TB General Purpose (GP3 SSD) |
Core ZooKeeper | m5.2xlarge | 100 GB Magnetic |
SRM | m5.2xlarge | 100 GB Magnetic |
Connect | m5.2xlarge | 100 GB Magnetic |
KRaft | m5.2xlarge | 100 GB Magnetic |
Node | Instance type | Storage configuration |
---|---|---|
Master | e2-highmem-16 | 100 GB Standard persistent disks (HDD) |
Manager 1 | e2-standard-16 | 100 GB Standard persistent disks (HDD) |
Core Broker | e2-standard-8 | 1 TB Solid-state persistent disks (SSD) |
Broker | e2-standard-8 | 1 TB Solid-state persistent disks (SSD) |
Core ZooKeeper | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
SRM | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
Connect | e2-standard-8 | 100 GB Standard persistent disks (HDD) |
KRaft | e2-standard-8 | 100 GB Standard persistent disks (HDD) |