Managing FreeIPA

FreeIPA is the backbone of the CDP Identity Management functionality. After you configure a CDP environment, FreeIPA works to provide user identities without the need for your attention. In case of problems, you may need to perform troubleshooting to ensure the health of the identity management system.

By default, the FreeIPA instance consists of a single VM, which makes it a potential single point of failure for a CDP environment. You can choose to configure your CDP environment to run FreeIPA in high-availability (HA) mode, which runs multiple instances of FreeIPA on separate hosts. When configured, this mode provides automatic failover if one FreeIPA instance fails, and a scripted manual process to recover the system with no downtime after such a failure.

The CDP environment backs up the FreeIPA state periodically (by default, hourly). The backup data is stored on an attached volume (AWS) or managed disk (Azure). This backup allows the state to be recovered in the event of a failure. Without HA mode enabled, recovering from a FreeIPA failure requires a recovery process that is facilitated by Cloudera technical support.

The host and status of each FreeIPA instance are displayed in the environment's Summary tab in the Management Console.

FreeIPA HA

From the CLI, you can choose to create more than one FreeIPA instance (up to a maximum of 3), and the system automatically configures FreeIPA in HA mode. The FreeIPA instances run on separate hosts within the same availability zone.

The system creates multiple FreeIPA instances and replicates identity management data across all of them. If a conflict arises while synchronizing across instances, the system keeps the "last in" content. If one of the FreeIPA instances fails to pass the environment's status checks, the overall status for FreeIPA turns gray. The FreeIPA clients switch to another FreeIPA instance and the system remains functional. After a week in this state, the identity management system may start to fail because of expiring certificates and other problems.
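The "last in" rule can be illustrated with a small Python sketch. This is only a model of the idea, not FreeIPA's actual replication code: the `Change` record, the replica names, and the use of a plain timestamp to order writes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One write to an identity attribute on some replica (hypothetical model)."""
    replica: str      # replica that accepted the write
    timestamp: float  # time of the write, in seconds
    value: str        # new attribute value


def resolve_last_in(changes):
    """Return the change that wins under a "last in" policy: the latest write."""
    if not changes:
        raise ValueError("no changes to resolve")
    # max() keeps the first maximal element, so ties go to the earlier list entry.
    return max(changes, key=lambda c: c.timestamp)


# Two replicas accepted conflicting writes to the same attribute.
conflict = [
    Change("ipa-a", 100.0, "shell=/bin/bash"),
    Change("ipa-b", 105.0, "shell=/bin/zsh"),
]
winner = resolve_last_in(conflict)
print(winner.value)  # shell=/bin/zsh
```

The practical consequence is that the earlier of two conflicting writes is silently discarded, so the write that reaches the system last is the one every replica ends up with.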

The number of nodes used depends on the Data Lake scale selected. In the UI, Light Duty uses 2 nodes and Medium Duty uses 3. The CLI also defaults to 2 nodes. To avoid the cost of the additional node, you can create the environment with a single node:
    cdp environments create-aws-environment
    [...]
    --free-ipa instanceCountByGroup=1
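If environment creation is scripted, the instance count can be validated before the option is passed to the CLI. The following Python sketch only builds the `--free-ipa` option shown above; the helper names are hypothetical, and the limit of 3 is taken from the HA description in this section.

```python
MAX_FREEIPA_INSTANCES = 3  # upper limit stated in this section


def freeipa_option(instance_count):
    """Return the --free-ipa option pair for the requested instance count."""
    if isinstance(instance_count, bool) or not isinstance(instance_count, int):
        raise TypeError("instance_count must be an integer")
    if not 1 <= instance_count <= MAX_FREEIPA_INSTANCES:
        raise ValueError(
            f"instance_count must be between 1 and {MAX_FREEIPA_INSTANCES}"
        )
    return ["--free-ipa", f"instanceCountByGroup={instance_count}"]


def is_high_availability(instance_count):
    """More than one instance means FreeIPA runs in HA mode."""
    return instance_count > 1


print(" ".join(freeipa_option(1)))  # --free-ipa instanceCountByGroup=1
print(is_high_availability(1))      # False
```

The returned list can be appended to the rest of the `cdp environments create-aws-environment` arguments, which are elided here just as they are in the command above.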
When you see a status other than "Running", follow these steps to investigate and resolve problems:
  1. Determine which FreeIPA instance or instances need attention.

    To retrieve a detailed status, use the CDP command-line interface. See Show FreeIPA instance status.

  2. Repair one or more FreeIPA instances.

    The CDP command-line interface includes a command to trigger a FreeIPA check and reboot repair process. The repair command should resolve most problems with the identity management system. For example, it checks whether FreeIPA hosts are stopped and, if so, starts them. If the hosts are already running, the repair process reboots them.

    The repair command may cause a cluster outage:
    • If at least one instance is running, the repair command can avoid an outage.
    • If all instances have failed, the repair command won't be able to avoid an outage.
    • If the repair command is run with its "force" option against all instances, the command will cause an outage.

    If the FreeIPA status for the environment returns to "Running", you can stop here.

  3. Rebuild the identity management system.

    If the repair process does not resolve the problem, the CDP command-line interface also includes a command to trigger a rebuild process that destroys and rebuilds N-1 nodes in the identity management system. This process restores content from the most recent backup and does not require cluster downtime.

    If all nodes need to be repaired or if the unsupervised rebuild process fails, Cloudera Technical Support can help you perform a rebuild of the identity management system and restore content from a backup. This process will require cluster downtime.
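The outage rules from step 2 can be summarized in a short Python sketch. It is a decision aid under stated assumptions, not part of the CDP CLI: the function name, the instance IDs, and any status string other than "Running" are hypothetical, and the function does not call the CLI itself.

```python
def plan_freeipa_action(instance_statuses, force_all=False):
    """Suggest the next step and whether an outage should be expected.

    instance_statuses maps a hypothetical instance ID to a status string.
    Only "Running" is treated as healthy.
    """
    if not instance_statuses:
        raise ValueError("at least one FreeIPA instance is required")
    healthy = sorted(i for i, s in instance_statuses.items() if s == "Running")
    failed = sorted(i for i, s in instance_statuses.items() if s != "Running")
    if not failed:
        # Every instance is running, so there is nothing to repair.
        return {"action": "none", "targets": [], "outage_expected": False}
    if force_all:
        # Forcing the repair against all instances causes an outage.
        return {
            "action": "repair",
            "targets": sorted(instance_statuses),
            "outage_expected": True,
        }
    # With at least one healthy instance, clients can fail over during repair.
    return {"action": "repair", "targets": failed, "outage_expected": not healthy}


plan = plan_freeipa_action({"ipa-a": "Running", "ipa-b": "Stopped"})
print(plan["targets"])          # ['ipa-b']
print(plan["outage_expected"])  # False
```

If the repair does not return the status to "Running", the next step is the rebuild described in step 3.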

FreeIPA failure scenarios

Because FreeIPA is a background system, you are not likely to encounter any failures that include a specific reference to FreeIPA in the error text. Instead, problems with FreeIPA show up as DNS problems, user login problems that raise Kerberos errors, and authentication errors when provisioning workload clusters. If you encounter these general errors, consider checking the status of the FreeIPA system.