Configure Kafka rack awareness

Learn how to configure rack awareness and multi-level rack awareness for Kafka brokers and clients.

Racks provide information about the physical location of a broker or a client. A Kafka deployment can be made rack aware by configuring rack awareness for the Kafka brokers and clients respectively. Enabling rack awareness can help in hardening your deployment, it provides durability guarantees for your Kafka service, and significantly decreases the chances of data loss.

The following tasks give step by step instructions on how you can configure and enable rack awareness and multi-level rack awareness for Kafka brokers and clients. The following covers configuration only. If you want to learn more about the rack awareness feature and how rack-aware Kafka deployments behave, see Kafka rack awareness.

Configuring rack awareness for Kafka brokers

Learn how to configure rack awareness for Kafka brokers.

Rack awareness can be enabled and configured by specifying rack IDs. This can be done in two ways.

You can set the rack IDs on the level of the host machine in Cloudera Manager and configure Kafka to use the rack IDs specified for the host machines. Rack IDs are specified with the Hosts > All Hosts > Actions for Selected > Assign Rack action in Cloudera Manager. Kafka can be configured to use these IDs by selecting the Enable Rack Awareness property.

Alternatively, you can manually specify the rack ID for each broker role using the Broker Rack ID property. In this case, selecting Enable Rack Awareness is not required.

If IDs are specified in Broker Rack ID and Enable Rack Awareness is selected, the ID's specified in Broker Rack ID take precedence.

  • In order for rack awareness to properly function, the brokers in your deployment must be spread across available racks. If all brokers are deployed on the same rack, enabling and configuring rack awareness will not provide you with any benefits.
  • If you previously configured and enabled rack awareness by manually configuring the broker.rack property with Kafka Broker Advanced Configuration Snippet (Safety Valve), ensure that you remove all broker.rack entries from the advanced configuration snippet. The advanced configuration snippet takes precedence and overwrites the configuration set by both Enable Rack Awareness and Broker Rack ID.
  • If you want to specify the rack IDs on the level of the host, you must specify rack information for your hosts before you start the following procedure. Complete Specifying Racks for Hosts.
  • If you are using Broker Rack ID to set rack IDs, ensure that you set the property for each broker instance separately. Do not specify this property on a Kafka service level.
  1. In Cloudera Manager, select the Kafka service.
  2. Configure rack IDs.
    1. Go to Configuration.
    2. Find and select the Enable Rack Awareness property.
    3. Click Save Changes.
    Repeat the following steps for each Kafka broker role instance.
    1. Go to Instances.
    2. Click any of the broker role instances and go to Configuration.
    3. Find and configure the Broker Rack ID property.

      Add the ID of the rack (or other physical infrastructure) that the broker is deployed in. For example, rack1 or DC1

    4. Click Save Changes.
  3. Restart the Kafka service.
Rack awareness is enabled and configured for the Kafka brokers.
Configure rack awareness for Kafka clients.

Configuring multi-level rack awareness for brokers

Learn how to enable and configure multi-level rack awareness for brokers.

Multi-level rack awareness can be enabled and configured by specifying multi-level rack IDs and selecting the Enable multi-level rack awareness Kafka service property.

A multi-level rack ID has a different format than a standard rack ID. It is an absolute path, it starts with a slash ( / ), but does not end in a slash. For example, /uswest/R0 or /DC1/Rack1/HostA. The ID can be specified in two ways.

You can set the rack IDs on the level of the host machine in Cloudera Manager and configure Kafka to use the rack IDs specified for the host machines. Rack IDs are specified with the Hosts > All Hosts > Actions for Selected > Assign Rack action in Cloudera Manager. Kafka can be configured to use these IDs by selecting the Enable Rack Awareness property.

Alternatively, you can manually specify the rack ID for each broker role using the Broker Rack ID property. In this case, selecting Enable Rack Awareness is not required.

If IDs are specified in Broker Rack ID and Enable Rack Awareness is selected, the ID's specified in Broker Rack ID take precedence.

AlI IDs you specify for your brokers must have an identical number of levels in the ID.
  • In order for multi-level rack awareness to properly function, the topology of your deployment must be compatible. That is, the brokers in your deployment must be spread across available physical infrastructure (for example DCs and racks). If your deployment topology is not compatible, enabling and configuring multi-level rack awareness will not provide you with any benefits.
  • If you previously configured and enabled rack awareness by manually configuring the broker.rack property with Kafka Broker Advanced Configuration Snippet (Safety Valve), ensure that you remove all broker.rack entries from the advanced configuration snippet. The advanced configuration snippet takes precedence and overwrites the configuration set by both Enable Rack Awareness and Broker Rack ID.
  • If you want to specify the rack IDs on the level of the host, you must specify rack information for your hosts before you start the following procedure. Complete Specifying Racks for Hosts.
  • If you are using Broker Rack ID to set rack IDs, ensure that you set the property for each broker instance separately. Do not specify this property on a Kafka service level.
  1. In Cloudera Manager, select the Kafka service.
  2. Configure rack IDs.
    1. Go to Configuration.
    2. Find and select the Enable Rack Awareness property.
    3. Click Save Changes.
    Repeat the following steps for each Kafka broker role instance.
    1. Go to Instances.
    2. Click any of the broker role instances and go to Configuration.
    3. Find and configure the Broker Rack ID property.

      Add the ID of the rack (or other physical infrastructure) that the broker is deployed in. Ensure that you set multi-level IDs. For example, /uswest/R0 or /DC1/Rack1/HostA.

    4. Click Save Changes.
  3. Find and select the Enable multi-level rack awareness property.
  4. Restart the Kafka service.

Multi-level rack awareness is enabled for the broker. The brokers create replica assignments with multi-level rack awareness guarantees.

Configure rack awareness for Kafka clients.

Configuring rack awareness for Kafka consumers

Learn how to make Kafka consumers rack aware by enabling and configuring follower fetching.

Kafka Consumers can be made rack aware enabling follower fetching for your Kafka deployment. Follower fetching can be enabled by configuring the replica.selector.class property for the broker and configuring the client.rack property in the consumer’s configuration. The replica.selector.class property is not directly available for configuration in Cloudera Manager and you must use an advanced security snippet to configure it.

Ensure that brokers have rack awareness enabled. For more information, see Configuring rack awareness for Kafka brokers.
  1. In Cloudera Manager, select the Kafka service.
  2. Go to Configuration.
  3. Find the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties property.
  4. Add the following configuration entry to the advanced configuration snippet.
    replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
  5. Click Save Changes.
  6. Restart the Kafka service.
  7. Add the following to your consumer configuration.
    client.rack=[***RACK ID***]
    
    Replace [***RACK ID***] with the ID of the rack that the consumer is running in. The rack ID should match one of the rack ID’s you configured for the brokers. Ensure that you configure each consumer and add its corresponding rack ID. If the consumer is deployed in a rack with no brokers, specify the rack ID of a broker that is closest to the rack that the consumer is running in.
Follower fetching is enabled for the Kafka deployment. Kafka consumers are now rack aware and attempt to consume from the replica that is in the closet rack instead of consuming from the replica leader.

Configuring multi-level rack awareness for consumers

Learn how to configure consumer follower fetching in Kafka deployment that has multi-level rack awareness enabled.

If multi-level rack awareness is enabled for the brokers, you can specify a multi-level rack ID for the consumers in their configuration. If a multi-level rack ID is specified for the consumer, the consumer will fetch messages from the closest replica in the multi-level hierarchy based on the ID it is provided. The rack ID can be set with the client.rack property in the consumer’s configuration.
Ensure that brokers have multi-level rack awareness enabled. For more information, see Configuring multi-level rack awareness for brokers.
  1. Choose an appropriate multi-level rack ID for the consumer.
    A multi-level rack ID is an absolute path, it starts with a slash ( / ) but does not end in a slash. The exact ID that you specify depends on the levels of hierarchy in your deployment (cluster topology) and the rack IDs specified for the broker.

    With standard rack awareness and follower fetching, the IDs you specify for the consumer must exactly match the ID specified for the brokers. With multi-level rack awareness, this is not the case. The levels of hierarchy represented in the ID of the consumer can differ compared to the broker rack IDs. Only a partial match is required to find a closest replica.

    For example, assume you have a two level hierarchy, DCs on the top level, racks on the second level. Additionally, assume there are two replicas. They are located in /DC1/R1 and /DC2/R3. Depending on the configuration of client.rack, the consumer will select different replicas. The behavior is as follows:
    • If client.rack is /DC1, the /DC1/R1 replica is selected.
    • If client.rack is /DC2/R5, the /DC2/R3 replica is selected.
    • If client.rack is /DC2/R3, the /DC2/R3 replica is selected.
    • If client.rack is not set, the leader is selected.
    • If client.rack is random_id, the leader is selected.
    • If client.rack is DC3, the leader is selected.
  2. Set the client.rack property.
    For example:
    client.rack=/DC1/R1
Consumer IDs are configured. The consumers will fetch data from the closest replica in the multi-level deployment.

Configuring rack awareness for Kafka producers

Learn how to enable and configure rack awareness for Kafka producers.

Enabling rack awareness for Kafka producers involves configuring your Kafka deployment in a way that ensures that producers commit messages to at least two separate brokers that are deployed on different racks. This can be done by configuring your producers to provide the highest available guarantee on message delivery and configuring min.insync.replicas for your topics.

Ensure that brokers have rack awareness enabled. For more information, see Configuring rack awareness for Kafka brokers.
  1. Add the following to your producer configuration.
    acks=all

    This is the default configuration for producer version 3.0.0 or later. As a result, configuring this property might not be required.

  2. Configure min.insync.replicas for the produced topics to a value that ensures the desired number of racks (minimum of 2) get the message before the produce request is considered successful.
Rack awareness for Kafka producers is configured. Producers will now ensure that messages are produced to at least 2 (or more) of the available racks.