Repair FreeIPA

When running in high-availability mode, the identity management system runs multiple instances of FreeIPA on separate hosts. In case of failure, you can repair failed hosts using the CDP CLI within one week of a node failing.

The repair process for FreeIPA hosts performs the following steps, stopping when a healthy status returns:
  • Start any FreeIPA hosts that are stopped.
  • Reboot FreeIPA hosts. (same physical host, same public DNS name, private IP address, and associated storage).
  • Restore identity management data from backup.
  • Stop and restart FreeIPA hosts (rebuild using new hardware).

This procedure uses the CDP CLI. If you haven't already installed the CLI, see Installing the Cloudera client for instructions.

On Azure, before you run the repair command, make sure that the resource group has neither a DELETE nor a READ-ONLY lock applied.

Steps

Run the FreeIPA repair command. Run this command from a computer that has network access to the FreeIPA hosts.

cdp environments repair-freeipa --environment-name <value>
                  [--force | --no-force]
                  [--instances <value>]
                  [--cli-input-json <value>]
                  [--generate-cli-skeleton]
                  [--repair-type <string>]

where the options are the following:

Option Description
--environment-name <value> Specifies the FreeIPA's environment name or CRN. The environment CRN is listed in Environment > Summary > General.
--force | --no-force Choose to force the repair even if the status if the FreeIPA nodes are good. If not specified, defaults to no-force.
--instances <value> Specifies the instance IDs to repair. Use a space to separate multiple instance IDs. If no IDs are provided then all instances are considered for repair. The FreeIPA instance IDs are listed in Environment > Summary > FreeIPA.

You can get the IDs from the output of the cdp environments get-freeipa-status command.

--cli-input-json <value> Performs the operations indicated in the command provided in JSON format. Call generate-cli-skeleton to see a template of the required JSON content. If other arguments are provided on the command line, the command line values override the JSON-provided values.
--generated-cli-skeleton Displays an example of the format of the input JSON. The repair command does not run.
--repair-type <string> The type of FreeIPA repair to perform. Possible values:
  • AUTO - Currently, this is the same as reboot but this may change in the future.
  • REBOOT - Repair the failed instances by rebooting them.
  • REBUILD - Repair the failed instances by deleting them and creating new instances, then replicate data from an existing instance to the new instances. Needs powerUser role.
For example, the following command repairs two instances without forcing the repair:
$ cdp environment repair-freeipa --environment-name crn:cdp:environments:us-west-1:12a0079b-1591-dd33-b721-a446bda74e67:environment:36853fcc-2fef-4094-834c-557b4aea34ee --instances i-078ba50f9feb6638f i-09e8b54a343b33d2 
cdp environments repair-freeipa --environment-name john-doe-env2-25793 --instances i-0288d991ed998ec03 --force
{
    "operationId": "edda2c68-5a29-4f60-a150-aca963b36ead",
    "status": "RUNNING",
    "successfulOperationDetails": [],
    "failureOperationDetails": [],
    "startDate": "2020-10-01T19:48:36.009000+00:00"
}

If you see the following error, consider rerunning the repair command with the --force option:

An error occurred: {"message":"No unhealthy instances to reboot. Maybe use the force option."} (Status Code: 404; Error Code: NOT_FOUND; Service: environments; Operation: repairFreeipa; Request ID: eca3a7fb-aa6b-48f7-ae29-b881161869e5;)

Check FreeIPA repair status

You can check the status of an in-progress repair operation with get-repair-freeipa-status.

This procedure uses the CDP CLI. If you haven't already installed the CLI, see Installing the Cloudera client for instructions.

Steps

Run the FreeIPA status command. Run this command from a computer that has network access to the FreeIPA hosts.
cdp environments get-repair-freeipa-status --operation-id <value>
          [--cli-input-json <value>]
          [--generate-cli-skeleton]
where the options are the following:
Option Description
--operation-id <value> Operation-id for the previously requested repair operation.
--cli-input-json <value> Performs service operation based on the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, the CLI values will override the JSON-provided values.
--generate-cli-skeleton Prints a sample input JSON to standard output. Note the specified operation is not run if this argument is specified. The sample input can be used as an argument for --cli-input-json.
This command can take 15 to 45 seconds to run as it gathers information in real-time. The output of the status command provides the status for the repair operation (in JSON format).
cdp environments get-repair-freeipa-status --operation-id edda2c68-5a29-4f60-a150-aca963b36ead
{
    "status": "COMPLETED",
    "successfulOperationDetails": [
        {
            "environmentCrn": "crn:cdp:environments:us-west-1:9d74eee4-1cad-45d7-b645-7ccf9edbb73d:environment:08c55413-6e2b-4664-8367-ef3fc0787773"
        }
    ],
    "failureOperationDetails": [],
    "startDate": "2020-10-01T19:48:36.009000+00:00",
    "endDate": "2020-10-01T19:49:08.392000+00:00"
    }
Element Data Type Description
status string Status of a repair operation. Possible values: REQUESTED, RUNNING, COMPLETED, FAILED, REJECTED, TIMEDOUT.
successfulOperationDetails array List of operation details for all successes. If the repair is only partially successful both successful and failure operation details will be populated.
failureOperationDetails array List of operation details for failures. If the repair is only partially successful both successful and failure operation details will be populated.
error string If there is any error associated. The error will be populated on any error and it may be populated when the operation failure details are empty. The error will typically contain the high level information such as the associated repair failure phase.
startDate datetime Date when the operation started.
endDate datetime Date when the operation ended. Omitted if operation has not ended.