Cloudera Lakehouse Optimizer REST APIs

Cloudera Lakehouse Optimizer provides REST APIs that you can use to create, manage, and monitor the Cloudera Lakehouse Optimizer policies. You can also use REST APIs to onboard namespaces, pause and resume maintenance, perform manual table maintenance, view task status and metadata, and fetch the details of different policy-related operations.

You can use the following REST APIs, when necessary, to perform the maintenance tasks:

REST API Required parameters Role required Description
config
GET /config/backup
name
string
(query)
  • Administrator
Creates a backup of all the policies and association files with the resource entries, and saves it in the specified file in tar.gz format.
GET /config/health
scope
string
(query)
  • Administrator
  • Operator
  • Monitor
Fetches the full health state of the Cloudera Lakehouse Optimizer components when the scope query parameter is set to full.

By default, the API fetches partial health state.
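As a minimal sketch of the two health scopes, the following helper builds the request URL with and without the scope query parameter. The base URL is a hypothetical placeholder, not a documented address; replace it with the endpoint of your deployment.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical base URL (assumption); replace with your service address.
BASE_URL = "https://clo.example.com/api/"


def health_url(full: bool = False, base_url: str = BASE_URL) -> str:
    """Build the GET /config/health URL.

    Without scope, the API returns the partial health state;
    scope=full requests the full health state.
    """
    url = urljoin(base_url, "config/health")
    if full:
        url += "?" + urlencode({"scope": "full"})
    return url


if __name__ == "__main__":
    print(health_url())           # partial health state (default)
    print(health_url(full=True))  # full health state
```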

POST /config/reconfigure -
  • Administrator
Fetches the latest policies, and then restarts or reconfigures the Cloudera Lakehouse Optimizer service.
POST /config/restore
simulate
boolean
(query)
  • Administrator
Restores the backup from the tar.gz file in the /backup location when the simulate query parameter is set to true.
namespaces
GET /namespaces/{namespace}/policies
namespace
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the required namespace name. The API fetches the policy names associated with at least one of the tables in the specified namespace.
GET /namespaces/active -
  • Administrator
  • Operator
  • Monitor
Fetches all the active namespaces maintained by Cloudera Lakehouse Optimizer.
GET /namespaces
fetch
boolean
(query)
  • Administrator
  • Operator
  • Monitor
Fetches all the namespaces in the catalog when the fetch query parameter is set to true. Otherwise, the API fetches only the declared namespaces that are maintained by Cloudera Lakehouse Optimizer.
GET /namespaces/{namespace}
namespace
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the required namespace name. The API fetches the details about the specified namespace.
PUT /namespaces/{namespace}
namespace
string
(path)
  • Administrator
Enter the required namespace name. The API adds or declares the specified namespace to the list of maintainable namespaces for Cloudera Lakehouse Optimizer.
DELETE /namespaces/{namespace}
namespace
string
(path)
  • Administrator
Enter the required namespace name. The API removes the specified namespace from the list of maintainable namespaces for Cloudera Lakehouse Optimizer.
PATCH /namespaces/{namespace}
namespace
string
(path)
  • Administrator
  • Operator
Enter the required namespace name. The API updates the policies for all the tables in the specified namespace, and reschedules the policies.
PUT /namespaces/{namespace}/paused
namespace
string
(path)
  • Administrator
  • Operator
Enter the required namespace name. The API pauses the maintenance for the specified namespace. Cloudera Lakehouse Optimizer removes the specified namespace from the list of maintainable namespaces.
DELETE /namespaces/{namespace}/paused
namespace
string
(path)
  • Administrator
  • Operator
Enter the required namespace name. The API resumes the maintenance for the specified namespace. Cloudera Lakehouse Optimizer adds the specified namespace to the list of maintainable namespaces.
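Pausing and resuming use the same path and differ only in the HTTP method. The sketch below makes that pairing explicit; the function name is hypothetical, and it only returns the method and path rather than sending a request.

```python
from urllib.parse import quote


def namespace_pause_request(namespace: str, paused: bool) -> tuple[str, str]:
    """Return the (method, path) pair that pauses or resumes a namespace.

    PUT on /namespaces/{namespace}/paused pauses maintenance;
    DELETE on the same path resumes it.
    """
    method = "PUT" if paused else "DELETE"
    path = "/namespaces/" + quote(namespace, safe="") + "/paused"
    return method, path


if __name__ == "__main__":
    # "sales" is an illustrative namespace name, not one from the product.
    print(namespace_pause_request("sales", paused=True))
    print(namespace_pause_request("sales", paused=False))
```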
policies
POST /policies/{policyName}/association
policyName
string
(path)
  • Administrator
  • Operator
Enter the policy name. The API creates a policy association to the entire catalog, entire namespace, or to a particular table. The association is provided as a list of dot-separated strings to specify the hierarchy.

For example, when you enter {"associations": ["a", "b.c", "d.e.f"]} in the request body, the policy applies to the entire catalog a, the entire namespace c of catalog b, and table f of namespace e in catalog d.
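The dot-separated hierarchy can be sketched as follows: one part addresses a catalog, two parts a namespace within a catalog, and three parts a table. The helper names are hypothetical, and the request body shape is assumed to be the associations list shown in the example.

```python
import json

# Hierarchy levels, from outermost to innermost.
LEVELS = ("catalog", "namespace", "table")


def describe_association(entry: str) -> dict:
    """Split a dot-separated association into its hierarchy levels."""
    parts = entry.split(".")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"expected 1 to 3 non-empty dot-separated parts: {entry!r}")
    # The number of parts determines how deep the association reaches.
    return {"scope": LEVELS[len(parts) - 1], **dict(zip(LEVELS, parts))}


def association_body(entries: list) -> str:
    """Validate each entry, then serialize the request body as JSON."""
    for entry in entries:
        describe_association(entry)
    return json.dumps({"associations": list(entries)})


if __name__ == "__main__":
    for entry in ["a", "b.c", "d.e.f"]:
        print(describe_association(entry))
    print(association_body(["a", "b.c", "d.e.f"]))
```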

GET /policies/{policyName}/scope/{resourceScope}/def
policyName
string
(path)

resourceScope
string
(path)

solve
string
(query)

base64
boolean
(query)
  • Administrator
  • Operator
Enter the policy name, and enter the resource scope as catalog, namespace, or table. The API fetches the policy definition for the specified policy in the specified resource scope.
PUT /policies/{policyName}/scope/{resourceScope}/def
policyName
string
(path)

resourceScope
string
(path)

base64
boolean
(query)
  • Administrator
Enter the policy name, and enter the resource scope as catalog, namespace, or table. The API updates the policy definition for the specified policy in the specified resource scope.
POST /policies/{policyName}/scope/{resourceScope}/def
policyName
string
(path)

resourceScope
string
(path)

base64
boolean
(query)
  • Administrator
Enter the policy name, and enter the resource scope as catalog, namespace, or table. The API creates a policy definition for the specified resource.
DELETE /policies/{policyName}/scope/{resourceScope}/def
policyName
string
(path)

resourceScope
string
(path)
  • Administrator
Enter the policy name, and enter the resource scope as catalog, namespace, or table. The API deletes the specified policy definition for the specified resource.
GET /policies/resource
uri
string
(query)
  • Administrator
  • Operator
  • Monitor
Enter the URI query parameter. The API resolves the specified URI and fetches the policy script.

For example, tpp is the authority for JSON files, which contain the policy constants. The dlm://tpp:default/daily URI fetches the daily.json file from the default level. The dlm://tpp/cat0/daily, dlm://tpp/cat0/ns0/daily, and dlm://tpp/cat0/ns0/tb0/daily URIs fetch the JSON file from the catalog, namespace, and table level respectively.

Similarly, tps is the authority for JEXL files, which contain the policy scripts. The dlm://tps:default/daily URI fetches the daily.jexl file defined at the default level.

The dlm://tps/cat0/daily, dlm://tps/cat0/ns0/daily, and dlm://tps/cat0/ns0/tb0/daily URIs fetch the JEXL file from the catalog, namespace, and table level respectively.
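The URI patterns above can be generated mechanically. The following sketch assumes only the forms shown in the examples (the tpp and tps authorities, the :default suffix, and up to three path levels); the function name and its parameters are hypothetical.

```python
# Authority per resource kind, taken from the examples above.
AUTHORITIES = {"json": "tpp", "jexl": "tps"}


def policy_resource_uri(kind, name, catalog=None, namespace=None, table=None):
    """Build a dlm:// URI for a policy resource at the requested level."""
    authority = AUTHORITIES[kind]
    levels = [catalog, namespace, table]
    given = [level for level in levels if level is not None]
    # Levels must be filled from the outside in: catalog, namespace, table.
    if levels[:len(given)] != given:
        raise ValueError("specify catalog before namespace, and namespace before table")
    if not given:
        return f"dlm://{authority}:default/{name}"
    return f"dlm://{authority}/" + "/".join(given + [name])


if __name__ == "__main__":
    print(policy_resource_uri("json", "daily"))
    print(policy_resource_uri("jexl", "daily", "cat0", "ns0", "tb0"))
```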

PUT /policies/resource
uri
string
(query)

base64
string
(query)
  • Administrator
Enter the URI query parameter. The API creates the policy resources based on the specified URIs.

To create a single resource, upload the resource file as multipart form data with the field name set to resource and the URI in the query parameter.

To create multiple resources, upload the files as multipart form data, using the URI of each resource as its field name.
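The two upload shapes can be sketched with a small multipart encoder. This is an illustration under assumptions: the helper names are hypothetical, the filename and Content-Type attached to each part are placeholders that the service may not require, and no request is sent.

```python
def encode_multipart(fields: dict, boundary: str) -> bytes:
    """Encode field-name -> bytes pairs as a multipart/form-data body."""
    body = b""
    for name, content in fields.items():
        header = (
            f"--{boundary}\r\n"
            # filename and Content-Type are illustrative assumptions.
            f'Content-Disposition: form-data; name="{name}"; filename="resource"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        body += header.encode("utf-8") + content + b"\r\n"
    return body + f"--{boundary}--\r\n".encode("utf-8")


def single_resource_upload(uri: str, content: bytes):
    """Single resource: URI in the query, one form field named 'resource'."""
    return {"uri": uri}, {"resource": content}


def multiple_resource_upload(resources: dict):
    """Multiple resources: each form field is named after its URI."""
    return {}, dict(resources)


if __name__ == "__main__":
    query, fields = single_resource_upload("dlm://tpp/cat0/daily", b"{}")
    print(query)
    print(encode_multipart(fields, boundary="example-boundary").decode("utf-8"))
```

Send the body with a Content-Type header of multipart/form-data that carries the same boundary value.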

DELETE /policies/resource
uri
string
(query)
  • Administrator
Deletes the policy resources, and returns the specified policy script or property file.
PATCH /policies/resource
uri
string
(query)

base64
string
(query)
  • Administrator
Enter the URI query parameter. The API updates the policy resources based on the specified URI.

To update a single resource, upload the resource file as multipart form data with the field name set to resource and the URI in the query parameter.

To update multiple resources, upload the files as multipart form data, using the URI of each resource as its field name.

PUT /policies/resources
base64
string
(query)
  • Administrator
Creates the policy resources.

To create a single resource, upload the resource file as multipart form data with the field name set to resource and the URI in the query parameter.

To create multiple resources, upload the files as multipart form data, using the URI of each resource as its field name.

PATCH /policies/resources
base64
string
(query)
  • Administrator
Updates the policy resources.

To update a single resource, upload the resource file as multipart form data with the field name set to resource and the URI in the query parameter.

To update multiple resources, upload the files as multipart form data, using the URI of each resource as its field name.

PUT /policies/{policyName}/tables/{tableName}/subs
policyName
string
(path)

tableName
string
(path)
  • Administrator
  • Operator
Enter the policy name and the table name to add or associate the specified policy to the specified table in the association file.
POST /policies/{policyName}/tables/{tableName}/subs
policyName
string
(path)

tableName
string
(path)
  • Administrator
  • Operator
Enter the policy name and the table name to assign or associate the specified policy to the specified table in the association file.
DELETE /policies/{policyName}/tables/{tableName}/subs
policyName
string
(path)

tableName
string
(path)
  • Administrator
  • Operator
Enter the policy name and the table name to unassign or remove the association of the specified policy for the specified table.
PUT /policies/{policyName}/tables/{tableName}/dryrun
policyName
string
(path)

tableName
string
(path)

needStatistics
boolean
(query)
  • Administrator
  • Operator
Enter the policy name and the table name to evaluate the specified policy for the specified table. The API only generates the maintenance actions and does not execute them, which allows you to verify whether the generated actions are as expected.

Set needStatistics to true to view the statistics for the maintenance actions.
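A dry-run request can be prepared as in the sketch below, which builds the request object without sending it. The base URL, the policy name, and the table name format are assumptions for illustration.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Hypothetical base URL (assumption); replace with your service address.
BASE_URL = "https://clo.example.com/api"


def dry_run_request(policy_name, table_name, need_statistics=False, base_url=BASE_URL):
    """Prepare PUT /policies/{policyName}/tables/{tableName}/dryrun."""
    path = (
        "/policies/" + quote(policy_name, safe="")
        + "/tables/" + quote(table_name, safe="")
        + "/dryrun"
    )
    # needStatistics is only added when statistics are wanted.
    query = "?" + urlencode({"needStatistics": "true"}) if need_statistics else ""
    return Request(base_url + path + query, method="PUT")


if __name__ == "__main__":
    # "daily" and "hive.default.t1" are illustrative values.
    req = dry_run_request("daily", "hive.default.t1", need_statistics=True)
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req), plus whatever authentication
    # your deployment requires.
```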

POST /policies/{policyName}/tables/{tableName}/evaluation
policyName
string
(path)

tableName
string
(path)
  • Administrator
  • Operator
Enter the policy name and the table name to schedule and evaluate the specified policy on the specified table. During this process, Cloudera Lakehouse Optimizer evaluates the policy, generates the maintenance actions, and sends them to Livy so that the Spark engine runs them on the Iceberg table in the order specified in the JEXL file.
GET /policies/active
namespace
string
(query)
  • Administrator
  • Operator
  • Monitor
Fetches the list of policies that are associated with at least one table in the specified namespace.
GET /policies/{policyName}/tables
policyName
string
(path)

namespace
string
(query)
  • Administrator
  • Operator
  • Monitor
Enter the policy name to fetch the tables associated with it. You can also enter the policy name and the namespace to fetch the table names associated with the specified policy in the specified namespace.
GET /policies/paused
namespace
string
(query)
  • Administrator
  • Operator
  • Monitor
Enter the namespace to fetch the list of all the policies that are in the paused list for the specified namespace.
DELETE /policies/paused -
  • Administrator
  • Operator
Removes all the policies from the paused list.

Cloudera Lakehouse Optimizer initiates the evaluation phase for these tables depending on the schedule in the policy.

GET /policies/{policyVersion}/paused
policyVersion
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the policy ID to fetch the reason for placing the policy in the paused list.

For example, the exception that appeared when the policy failed.

DELETE /policies/{policyVersion}/paused
policyVersion
string
(path)
  • Administrator
  • Operator
Enter the policy ID to remove the specified policy from the paused list.

Cloudera Lakehouse Optimizer initiates the evaluation phase for the tables depending on the schedule in the policy.

GET /policies/{policyName}/tables/{tableName}/desc
policyName
string
(path)

tableName
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the policy name and table name to fetch the description for the specified policy and table.
POST /policies/{policyName}/catalogs/{catalogName}/subs
policyName
string
(path)

catalogName
string
(path)
  • Administrator
  • Operator
Enter the policy name and catalog name to assign the specified policy to the specified catalog. This action assigns the policy to all the tables in the catalog across all namespaces.
DELETE /policies/{policyName}/catalogs/{catalogName}/subs
policyName
string
(path)

catalogName
string
(path)
  • Administrator
  • Operator
Enter the policy name and catalog name to unassign or remove the association of the specified policy from the specified catalog. The API unassigns the policy for all the tables in the catalog across all namespaces.
tables
GET /tables/{tableId}/paused
tableId
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the table ID to fetch the details about the specified table in the paused list, and the reason for placing it in the paused list.
PUT /tables/{tableId}/paused
tableId
string
(path)
  • Administrator
  • Operator
Enter the table ID to add the specified table to the paused list.
DELETE /tables/{tableId}/paused
tableId
string
(path)
  • Administrator
  • Operator
Enter the table ID to remove the specified table from the paused list.

Cloudera Lakehouse Optimizer initiates the evaluation phase for the table depending on the schedule set in the policy.

GET /tables/{tableId}/policies
tableId
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the table ID to fetch the policy names associated with the specified table.
GET /tables/{tableId}/stats
tableId
string
(path)

force
boolean
(query)
  • Administrator
  • Operator
  • Monitor
Enter the table ID to fetch the statistics associated with the specified table. Optionally, you can use query parameters to override the thresholds.

You can also specify the namespace and the table ID in the GET /tables/{tableId}/stats API. For example, GET /tables/hive.default.t1/stats/.

If the specified namespace is not available for Cloudera Lakehouse Optimizer, the table is retrieved from the default namespace.

GET /tables/paused -
  • Administrator
  • Operator
  • Monitor
Fetches the list of tables in the paused list.
DELETE /tables/paused -
  • Administrator
  • Operator
Removes all the tables from the paused list.

Cloudera Lakehouse Optimizer initiates the evaluation phase for the tables depending on the schedule set in the policy.

task details
GET /tasks/ingestlog -
  • Administrator
  • Operator
  • Monitor
Fetches the event log ingestion metadata.
POST /tasks/ingestlog -
  • Administrator
  • Operator
Triggers the Livy job to insert the event logs into the sys.task_events Iceberg table.
GET /tasks/id/{id}
id
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the task ID to fetch the task metadata for the specified task ID.
GET /tasks
policy
string
(query)

table
string
(query)

status
string
(query)

sortBy
string
(query)

count
string
(query)
  • Administrator
  • Operator
  • Monitor
Fetches the list of the 100 most recent tasks. You can override this limit using the Cloudera Manager > Clusters > cloudera_lakehouse_optimizer > Configuration > dlm.task.metadata.files.limit advanced configuration snippet. To view older tasks, query the sys.task_events Iceberg table.

Optionally, you can fetch the details based on:

  • the policy name,
  • the table name,
  • the task status, where the available values include INIT, SUBMITTED, COMPLETED, and FAILED,
  • the sort order (sortBy), either ASC or DESC, where DESC is the default, and
  • the count to specify the maximum number of tasks to fetch, where 10 is the default.
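The optional filters above can be combined into a query string as in this sketch. The function name is hypothetical, and the status check covers only the four values listed, which may not be exhaustive. Omitted parameters are left out so that the server defaults apply.

```python
from urllib.parse import urlencode

# Values named in the parameter descriptions; the status list may be incomplete.
KNOWN_STATUSES = {"INIT", "SUBMITTED", "COMPLETED", "FAILED"}
SORT_ORDERS = {"ASC", "DESC"}


def tasks_query(policy=None, table=None, status=None, sort_by=None, count=None):
    """Build the GET /tasks path with only the filters that were supplied."""
    if status is not None and status not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if sort_by is not None and sort_by not in SORT_ORDERS:
        raise ValueError("sortBy must be ASC or DESC")
    if count is not None and count < 1:
        raise ValueError("count must be a positive integer")
    params = {
        "policy": policy,
        "table": table,
        "status": status,
        "sortBy": sort_by,
        "count": count,
    }
    # Drop unset filters so the server-side defaults (DESC, 10) are used.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return "/tasks" + ("?" + query if query else "")


if __name__ == "__main__":
    print(tasks_query())
    print(tasks_query(policy="daily", status="FAILED", sort_by="ASC", count=25))
```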
GET /tasks/policies/{policyName}/tables/{tableName}/submitted
policyName
string
(path)

tableName
string
(path)
  • Administrator
  • Operator
  • Monitor
Enter the policy name and table name for a task in the submitted state to fetch its task metadata.