Launching profilers using the command line
Cloudera Data Catalog supports launching profilers from the Command-Line Interface (CLI).
The CLI is a single executable with no external dependencies. You can run some Cloudera Data Catalog operations using Cloudera CLI commands.
Users must have valid permissions to launch profilers on a data lake.
For details about access requirements, see Prerequisites to access Cloudera Data Catalog.
Prerequisites
You must have the following entitlement granted to use this feature:
DATA_CATALOG_ENABLE_API_SERVICE
For more information about the Cloudera command-line interface and how to set it up, see Cloudera CLI.
The Cloudera Data Catalog CLI
In your Cloudera CLI environment, enter the following command to list the available Cloudera Data Catalog commands:
cdp datacatalog --help
This command provides information about the available commands in Cloudera Data Catalog for Cloudera on cloud 7.2.18 and earlier versions.
NAME
datacatalog
DESCRIPTION
Cloudera Data Catalog Service is a web service, using this service user can execute operations like launching profilers in Data Catalog.
AVAILABLE SUBCOMMANDS
launch-profilers
Parameters for the profiler launch command
To get additional information about this command, run:
cdp datacatalog launch-profilers --help
NAME
launch-profilers - Launches DataCatalog profilers in a given datalake.
DESCRIPTION
Launches DataCatalog profilers in a given datalake.
SYNOPSIS
launch-profilers
--datalake <value>
[--enable-ha | --no-enable-ha]
[--cli-input-json <value>]
[--generate-cli-skeleton]
OPTIONS
--datalake (string)
The CRN of the Datalake.
--enable-ha | --no-enable-ha (boolean)
Enables High Availability (HA) for datacatalog profilers (default
value is false). The High Availability (HA) Profiler cluster
provides failure resilience and scalability but incurs additional
cost.
--cli-input-json (string)
Performs service operation based on the JSON string provided. The
JSON string follows the format provided by --generate-cli-skeleton.
If other arguments are provided on the command line, the CLI values
will override the JSON-provided values.
--generate-cli-skeleton (boolean)
Prints a sample input JSON to standard output. Note the specified
operation is not run if this argument is specified. The sample input
can be used as an argument for --cli-input-json.
OUTPUT
success -> (boolean)
Status of the profiler launch operation.
datahubCluster -> (object)
Information about a cluster.
clusterName -> (string)
The name of the cluster.
crn -> (string)
The CRN of the cluster.
creationDate -> (datetime)
The date when the cluster was created.
clusterStatus -> (string)
The status of the cluster.
nodeCount -> (integer)
The cluster node count.
workloadType -> (string)
The workload type for the cluster.
cloudPlatform -> (string)
The cloud platform.
imageDetails -> (object)
The details of the image used for cluster instances.
name -> (string)
The name of the image used for cluster instances.
id -> (string)
The ID of the image used for cluster instances. This is
internally generated by the cloud provider to uniquely
identify the image.
catalogUrl -> (string)
The image catalog URL.
catalogName -> (string)
The image catalog name.
environmentCrn -> (string)
The CRN of the environment.
credentialCrn -> (string)
The CRN of the credential.
datalakeCrn -> (string)
The CRN of the attached datalake.
clusterTemplateCrn -> (string)
The CRN of the cluster template used for the cluster creation.
FORM FACTORS
public
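The options described above can be combined in a few ways. The following is a minimal sketch, not an official procedure: it wraps the documented flags in small shell functions. The file name held in SKELETON_FILE and the function names are hypothetical, and the sketch assumes the cdp executable is on your PATH and already configured.

```shell
#!/usr/bin/env bash
# Hypothetical file used to hold the sample input JSON.
SKELETON_FILE="${SKELETON_FILE:-launch-profilers-input.json}"

# Save the sample input JSON. Per the help text, the launch
# operation itself is not run when this flag is given.
save_skeleton() {
  cdp datacatalog launch-profilers --generate-cli-skeleton > "$SKELETON_FILE"
}

# After editing the saved file, pass its contents as the JSON string.
launch_from_file() {
  cdp datacatalog launch-profilers --cli-input-json "$(cat "$SKELETON_FILE")"
}

# Launch profilers with High Availability enabled.
# $1 is the CRN of the data lake.
launch_with_ha() {
  cdp datacatalog launch-profilers --datalake "$1" --enable-ha
}
```

Because --enable-ha defaults to false and the HA Profiler cluster incurs additional cost, passing the flag explicitly only where needed keeps the cost decision visible in scripts.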
Parameters for the profiler delete command
To get additional information about this command, run:
cdp datacatalog delete-profiler --help
NAME
delete-profiler - Deletes DataCatalog profiler in a given datalake.
DESCRIPTION
Deletes DataCatalog profiler in a given datalake.
SYNOPSIS
delete-profiler
--datalake <value>
[--cli-input-json <value>]
[--generate-cli-skeleton]
OPTIONS
--datalake (string)
The CRN of the Datalake.
--cli-input-json (string)
Performs service operation based on the JSON string provided. The
JSON string follows the format provided by --generate-cli-skeleton.
If other arguments are provided on the command line, the CLI values
will override the JSON-provided values.
--generate-cli-skeleton (boolean)
Prints a sample input JSON to standard output. Note the specified
operation is not run if this argument is specified. The sample input
can be used as an argument for --cli-input-json.
OUTPUT
FORM FACTORS
public
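As an illustration of the delete-profiler synopsis above, the following sketch wraps the command in a shell function that refuses to run without a data lake CRN. The function name delete_profiler is hypothetical, and the sketch assumes the cdp executable is on your PATH and already configured.

```shell
#!/usr/bin/env bash
# Delete the profiler in the given data lake.
# $1 is the CRN of the data lake (required).
delete_profiler() {
  local datalake_crn="$1"
  if [ -z "$datalake_crn" ]; then
    echo "usage: delete_profiler <datalake-crn>" >&2
    return 2
  fi
  cdp datacatalog delete-profiler --datalake "$datalake_crn"
}

# Usage (replace the placeholder with your own CRN):
# delete_profiler "[***DATALAKE CRN***]"
```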
Launching the profiler
You can use the following CLI command to launch the data profiler:
cdp datacatalog launch-profilers --datalake [***DATALAKE CRN***]
Example:
cdp datacatalog launch-profilers --datalake crn:cdp:datalake:datacentername:c*****b-ccce-4**d-a**1-8********8:datalake:4*****5e-c**1-4**2-8**e-1********2
A successful launch returns the following response:
{
"success": true
}
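When the launch is scripted, it can help to check the success field of the response instead of reading it by eye. The following is a minimal sketch under stated assumptions: the function names and the DATALAKE_CRN variable are hypothetical, the check is a simple text match on the JSON shown above rather than a full JSON parse, and the cdp executable is assumed to be on your PATH and already configured.

```shell
#!/usr/bin/env bash
# Return 0 if the JSON read from stdin contains "success": true.
launch_succeeded() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Launch profilers for the given data lake CRN and report the outcome.
launch_profilers() {
  local datalake_crn="$1"
  local response
  # Fail early if the CLI itself exits with an error.
  if ! response="$(cdp datacatalog launch-profilers --datalake "$datalake_crn")"; then
    echo "launch-profilers command failed" >&2
    return 1
  fi
  if printf '%s\n' "$response" | launch_succeeded; then
    echo "Profilers launched"
  else
    echo "Launch was not reported as successful" >&2
    return 1
  fi
}

# Usage (set DATALAKE_CRN to your own data lake CRN first):
# launch_profilers "$DATALAKE_CRN"
```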