Disabling global application restarts

Use the Site Administration settings in Cloudera AI to disable automatic application restarts for an entire workbench and control how the workbench handles failed applications.

To improve the resilience and availability of deployed applications, Cloudera AI includes a built-in fault tolerance mechanism that automatically handles unexpected application failures. When an active application enters a failed state, the system attempts to restart the application up to three times. To prevent resource thrashing, the platform enforces a strict five-minute delay between consecutive restart attempts. The recovery loop terminates as soon as the application successfully transitions back to a Running state or when the three-attempt limit is reached.
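The recovery loop described above can be illustrated with a small simulation. This is a minimal sketch for reasoning about the behavior, not Cloudera AI code: the function name `recover`, the `restart` and `sleep` callables, and the `restarts_enabled` flag are all hypothetical, and the sketch assumes that the delay applies only between attempts, not before the first one.

```python
from typing import Callable, Tuple

MAX_RESTART_ATTEMPTS = 3          # up to three restart attempts
RESTART_DELAY_SECONDS = 5 * 60    # five-minute delay between attempts


def recover(
    restart: Callable[[], str],
    sleep: Callable[[float], None],
    restarts_enabled: bool = True,
) -> Tuple[str, int]:
    """Simulate the restart loop and return (final_status, attempts_used).

    `restart` is a hypothetical callable that tries to restart the
    application and returns its resulting status, for example "Running"
    or "Failed". `sleep` is injected so the delay can be faked in tests.
    """
    if not restarts_enabled:
        # Models the workbench-wide setting being cleared: no attempts are made.
        return "Failed", 0

    for attempt in range(1, MAX_RESTART_ATTEMPTS + 1):
        if attempt > 1:
            # Assumption: wait only between attempts, not before the first one.
            sleep(RESTART_DELAY_SECONDS)
        if restart() == "Running":
            # The loop ends as soon as the application is Running again.
            return "Running", attempt

    # The attempt limit was reached without a successful restart.
    return "Failed", MAX_RESTART_ATTEMPTS


# Usage: the first restart fails and the second succeeds.
# The delays are recorded in a list instead of actually waiting.
outcomes = iter(["Failed", "Running"])
recorded_delays = []
result = recover(lambda: next(outcomes), recorded_delays.append)
```

In this usage example, `result` is `("Running", 2)` and `recorded_delays` contains a single 300-second wait. Calling `recover` with `restarts_enabled=False` returns immediately without any attempt, which mirrors the effect of disabling restarts for the workbench.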

While automatic recovery is enabled by default for all users to ensure high availability, Site Administrators can disable automatic global application restarts for the entire workbench.

  1. In the Cloudera console, click the Cloudera AI tile.
    The Cloudera AI home page is displayed.
  2. Click the name of the workbench.
    The workbench Home page is displayed.
  3. Click Site Administration in the left navigation pane.
  4. Select the Settings tab.
  5. Scroll down to the application configuration section and clear the Enable Application Restart checkbox.
  6. Click Save Changes.