IPv6 Support and Dual-Stack Configuration

The Cloudera Base on premises 7.3.2.0 release introduces dual-stack support for a specific set of services. These services can listen on both IPv4 and IPv6 addresses simultaneously, so they accept connections over either protocol.

Dual-Stack Service Overview

You can now deploy cluster nodes in a dual-stack configuration. Client nodes support IPv4-only, IPv6-only, and dual-stack setups.

The following components are Cloudera-certified for dual-stack configurations:
  • HBase
  • Cloudera Data Explorer (Hue)
  • Hive (Hive on Tez)
  • Impala
  • Kafka
  • Kudu
  • Phoenix (including Phoenix Query Server)
  • ZooKeeper

Other Runtime components operate correctly in a dual-stack environment and accept IPv4 connections. However, Cloudera Manager does not allow you to control whether these services listen in IPv4-only mode or dual-stack mode. In dual-stack environments, you must configure Knox to operate in IPv4-only mode to prevent unintended IPv6 communication. For more information, see Configuring Knox for IPv4-only operation.

The certified components listed above include a new configuration parameter, IP_VERSION. You can set it to one of two values:
  • IPV4: The service accepts only IPv4 connections and uses IPv4 for all outgoing communication.
  • DUAL_STACK: The service accepts both IPv4 and IPv6 connections. While the service does not restrict outgoing communication, components prefer IPv4 by default.

Limitations

This section outlines the networking limitations for Cloudera Manager and third-party services in IPv6 environments and provides a workaround for connection failures. It explains why HBase, Beeline (Hive), Phoenix, ZooKeeper, Streams Messaging Manager, Streams Replication Manager, Schema Registry, and Cruise Control clients might fail on IPv6-only nodes, and how to resolve these failures by adding JVM options to the client environment variables so that the JVM prefers IPv6 addresses.
  • Management Services: The Cloudera Manager Server and Agents remain IPv4-only. These services continue to listen and communicate strictly through IPv4.
  • Third-Party Services: You must maintain third-party services on IPv4-only nodes.
  • Client Configuration: Using IPv6-only nodes for clients might require additional configuration depending on the component. While setting IP_VERSION to DUAL_STACK is the primary requirement, certain components might need further manual adjustments.
  • Clients Fail to Connect from IPv6-only Nodes
    In environments that use IPv6 exclusively, HBase, Beeline (Hive), Phoenix, ZooKeeper, Streams Messaging Manager, Streams Replication Manager, Schema Registry, and Cruise Control clients fail to establish connections. This issue occurs because default Java networking settings prioritize IPv4 or lack an explicit preference for IPv6 addresses. Consequently, connection attempts time out or fail.
    When a client tries to reach service components from a node that only supports IPv6, it cannot resolve or access the target services. To resolve this, you must configure the JVM to prefer IPv6 addresses and disable the IPv4-only stack preference.
    Workaround: Enable IPv6 Support for Clients
    To allow successful connections from an IPv6-only node, update the environment variables to include specific Java options.
    Update Environment Variables
    Add the following configuration flags to your client environment or shell profile (such as .bashrc, hadoop-env.sh, hbase-env.sh, hive-env.sh, or zkEnv.sh) to ensure the clients use the correct networking stack:
    • Flags to include:
       -Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true
    Implementation by Service
    Apply the flags to the corresponding environment variable for your specific client:
    • HBase:
       export HBASE_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $HBASE_OPTS"
    • Phoenix:
       export PHOENIX_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $PHOENIX_OPTS"
    • ZooKeeper:
       export CLIENT_JVMFLAGS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $CLIENT_JVMFLAGS"
    • Beeline (Hive):
       export HADOOP_CLIENT_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $HADOOP_CLIENT_OPTS"
    • Streams Messaging Manager:
       export SMM_JVM_PERF_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true"
    • Streams Replication Manager:
       export SRM_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $SRM_OPTS"
    • Schema Registry:
       export REGISTRY_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $REGISTRY_OPTS"
    • Cruise Control:
       export KAFKA_OPTS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true $KAFKA_OPTS"
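    The export lines above prepend the two flags unconditionally, so sourcing the same profile more than once repeats them in the variable. The following is a minimal sketch of an idempotent alternative, assuming a POSIX-compatible shell. The names prepend_ipv6_flags and IPV6_JVM_FLAGS are hypothetical and are not part of any Cloudera tooling. The sketch suits variables that keep their existing value; it does not reproduce the SMM_JVM_PERF_OPTS setting, which replaces the value with a full option list.

```shell
# Hypothetical helper (an illustration, not Cloudera tooling): prepend the
# IPv6 JVM flags to a named environment variable only if they are missing.
IPV6_JVM_FLAGS="-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true"

prepend_ipv6_flags() {
  _var_name="$1"
  # Accept only plain variable names, because the name is passed to eval.
  case "$_var_name" in
    ''|[0-9]*|*[!A-Za-z0-9_]*)
      echo "invalid variable name: $_var_name" >&2
      return 1 ;;
  esac
  # Read the current value of the named variable (empty if unset).
  eval "_current=\${$_var_name:-}"
  # Leave the variable untouched if the IPv6 preference is already present.
  case "$_current" in
    *-Djava.net.preferIPv6Addresses=true*) return 0 ;;
  esac
  if [ -n "$_current" ]; then
    export "$_var_name=$IPV6_JVM_FLAGS $_current"
  else
    export "$_var_name=$IPV6_JVM_FLAGS"
  fi
}

# Example: safe to run repeatedly from a profile script.
prepend_ipv6_flags HBASE_OPTS
```

    Calling the helper a second time with the same variable name leaves the value unchanged, which keeps the JVM command line free of duplicated options.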
    Apply the Changes
    1. Save the file: Save your changes to the configuration or profile script.
    2. Source the file: Execute the source command (for example, source ~/.bashrc) to load the new settings into your current session.
    3. Restart the application: Restart the client application or CLI tool to ensure it picks up the updated JVM parameters.
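    Before restarting a client, it can help to confirm that the current session actually carries both flags. The following sketch, assuming a POSIX-compatible shell, reports whether each named variable contains them. The function name check_ipv6_flags is hypothetical and introduced here only for illustration.

```shell
# Hypothetical helper (an illustration, not Cloudera tooling): report whether
# each named environment variable contains both IPv6-related JVM flags.
check_ipv6_flags() {
  _status=0
  for _name in "$@"; do
    # Accept only plain variable names, because the name is passed to eval.
    case "$_name" in
      ''|[0-9]*|*[!A-Za-z0-9_]*)
        echo "$_name: invalid variable name" >&2
        _status=1
        continue ;;
    esac
    # Read the current value of the named variable (empty if unset).
    eval "_value=\${$_name:-}"
    case "$_value" in
      *-Djava.net.preferIPv4Stack=false*) _has_v4_off=yes ;;
      *) _has_v4_off=no ;;
    esac
    case "$_value" in
      *-Djava.net.preferIPv6Addresses=true*) _has_v6_pref=yes ;;
      *) _has_v6_pref=no ;;
    esac
    if [ "$_has_v4_off" = yes ] && [ "$_has_v6_pref" = yes ]; then
      echo "$_name: ok"
    else
      echo "$_name: missing IPv6 flags"
      _status=1
    fi
  done
  # Non-zero when at least one variable lacks a flag or has an invalid name.
  return "$_status"
}

# Example: check the variables used by the HBase and ZooKeeper clients.
check_ipv6_flags HBASE_OPTS CLIENT_JVMFLAGS || echo "Source your profile again before restarting the client."
```

    The check only inspects the shell environment. It does not prove that a running client was started with these options, so restarting the client remains necessary.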