Profilers
Use profilers to launch and manage profiling jobs, run on-demand profiling, and track job status over time. In VM-based environments, you can also back up the profiler database and profile data in non-default buckets. Configure each profiler type to fine-tune schedules, resource usage, tag rules, and profiling behavior for your environment.
Profilers in VM-Based Environments
Launching profilers
Launch profiler clusters from the Cloudera Data Catalog user interface in VM-based environments.
Launching profilers using the command line
Launch profilers with the Cloudera Data Catalog CLI, including required parameters and usage.
Enabling or disabling profilers
Turn profiler scheduling on or off and control when profiling runs for your assets.
Tracking profiler jobs
Monitor profiler jobs, statuses, and execution details to troubleshoot issues and review profiling history.
Viewing profiler configurations
Review active profiler configuration settings from the Cloudera Data Catalog interface.
Configuring the Ranger Audit Profiler
Configure scheduling and resource settings for the Ranger Audit Profiler beyond the generic configuration.
Configuring the Cluster Sensitivity Profiler
Configure scheduling and resources for the Cluster Sensitivity Profiler beyond the generic settings.
Setting up column-name-based tagging
Set up column-name-based tagging for Cluster Sensitivity Profiler workflows in supported environments.
Profiler tag rules in VM-based environments
Define and manage profiler tag rules used with the Cluster Sensitivity Profiler in VM-based environments.
Creating tag rules in VM-based environments
Create tag rules that drive how classifications are applied for the Cluster Sensitivity Profiler.
Configuring the Hive Column Profiler
Configure optional settings for the Hive Column Profiler beyond the generic configuration.
Understanding the Cron Expression generator
Build and validate cron expressions used to schedule profiler jobs.
On-Demand Profilers
Profile specific assets on demand without relying on cron-based scheduling.
Deleting profilers
Delete profiler jobs and clusters, and understand the impact on related data and Kubernetes objects.
Profilers in Compute Cluster Enabled Environments
Launching profilers
Launch profiler clusters from the Cloudera Data Catalog user interface in Compute Cluster enabled environments.
Launching profilers using the command line
Launch profilers with the Cloudera Data Catalog CLI, including required parameters and usage.
Enabling or disabling profilers
Turn profiler scheduling on or off and control when profiling runs for your assets.
Tracking profiler jobs
Monitor profiler jobs, statuses, and execution details to troubleshoot issues and review profiling history.
Viewing profiler configurations
Review active profiler configuration settings from the Cloudera Data Catalog interface.
Configuring the Activity Profiler
Configure scheduling and resource settings for the Activity Profiler.
Configuring the Data Compliance Profiler
Configure scheduling, incremental profiling, and resource settings for the Data Compliance Profiler.
Profiler tag rules in Compute Cluster enabled environments
Define and manage profiler tag rules used with the Data Compliance Profiler in Compute Cluster enabled environments.
Creating tag rules in Compute Cluster enabled environments
Create tag rules that drive how classifications are applied for the Data Compliance Profiler.
Approving Data Compliance Profiler tags
Review and approve tags suggested by the Data Compliance Profiler before they are applied to assets.
Configuring the Statistics Collector Profiler
Configure scheduling, incremental profiling, and resource allocation for the Statistics Collector Profiler.
Understanding the Cron Expression generator
Build and validate cron expressions used to schedule profiler jobs.
On-Demand Profilers
Profile specific assets on demand without relying on cron-based scheduling.
Deleting profilers
Delete profiler jobs and clusters, and understand the impact on related data and Kubernetes objects.
