Placement group support

You can configure a Data Hub cluster on AWS for placement group support. Placement groups support placing VM instances across different physical hardware within a single availability zone. Placement groups can help ensure cluster availability in the event of a physical hardware failure within an availability zone.

In a Data Hub cluster, each host group can be associated with a placement group, of which there are three supported types: partition, spread, or cluster. Cloudera recommends using the partition type as the default for all host groups. For a partition placement group, the partition count will always be 2 and is not configurable.

Configuring placement group support requires adding custom properties to a cluster definition at the host group level. You can define the strategy as “NONE,” “PARTITION,” “SPREAD,” or “CLUSTER.”

For example:

{
  "environmentName": "sample-env",
  "instanceGroups": [
    {
      "nodeCount": 1,
      "name": "master",
      "type": "GATEWAY",
      "recoveryMode": "MANUAL",
      "template": {
        "aws": {
          "encryption": {
            "type": "NONE",
            "key": null
          },
          "placementGroup": {
            "strategy": "PARTITION"
          }
        },
        "instanceType": "m5.2xlarge",
        "rootVolume": {
          "size": 50
        },
        "attachedVolumes": [
          {
            "size": 100,
            "count": 1,
            "type": "standard"
          }
        ],
        "cloudPlatform": "AWS"
      },

Placement groups have a number of rules and limitations. Importantly, it is possible to run out of placement groups within an availability zone. Refer to the AWS documentation for detailed rules and limitations.