Cloudera Public Cloud Preview Features
The information provided on these pages outlines features currently available in preview. Access to preview features is provided to customers on request for trial and evaluation. The components are provided “as is”, without warranty or support, and Cloudera assumes no liability for their use. Customers should use preview features at their own risk.
To enable a preview feature in your Cloudera account, contact your account team.
Cloudera Data Engineering
- Using JVM debugger with Apache Spark jobs
-
published: 2022-11-23;
modified: 2022-11-23
Learn how to connect a JVM debugger remotely to Spark jobs (driver/executor).
- Using Custom Spark Runtime Docker Images via API/CLI
-
published: 2022-09-06;
modified: 2022-07-31
Learn how to run Spark jobs using custom Spark runtime Docker images via API/CLI.
- CDE In-place Upgrades
-
published: 2022-07-20;
modified: 2022-12-06
Learn how Cloudera Data Engineering enables in-place upgrades on AWS and Azure, supporting upgrades from up to two versions prior to the latest release.
Cloudera Data Hub
- Schedule-based Autoscaling for Cloudera Data Hub Clusters Using Impala
-
published: 2023-11-09;
modified: 2024-12-04
Schedule-based autoscaling for Cloudera Data Hub clusters using Impala is a feature that scales the number of nodes in an executor host group up or down based upon a schedule that you define.
Cloudera Data Warehouse
- Using Hive Data Connectors to support External Data Sources
-
published: 2023-11-20;
modified: 2023-11-20
Learn how you can achieve SQL query federation by using Hive data connectors to map databases present in external data sources to a local Hive Metastore.
- Deploying a shared Hue service
-
published: 2023-11-20;
modified: 2024-07-26
Learn about the advantages and upgrade limitations of deploying the shared Hue service and some FAQs that can help you understand more about the feature.
- Reserving nodes for auto-scaling
-
published: 2022-07-26;
modified: 2022-07-26
Learn how to accelerate Virtual Warehouse startup and autoscaling by configuring buffer nodes to keep compute instances on standby, ready to join new or autoscaled clusters.
- Using the Impala AI Function
-
published: 2024-07-26;
modified: 2024-07-26
Learn how you can use Impala's ai_generate_text function to access Large Language Models in SQL queries. This function enables you to input a prompt, retrieve the response, and include it in results.
- Using Impala to query external JDBC data sources
-
published: 2024-07-26;
modified: 2024-07-26
Learn how you can use external JDBC tables to connect Impala to a database, such as MySQL, PostgreSQL, or another Impala cluster, and read the data in the remote tables.
- Impala Workload Management
-
published: 2024-07-26;
modified: 2024-07-30
Learn how to use the Impala query logging capability in Cloudera Data Warehouse to archive essential query profile data into dedicated database tables, enabling consolidated reporting on previously executed queries.
Governance
- Integrating CDP Data Catalog with AWS Glue Data Catalog
-
published: 2021-08-09;
modified: 2021-12-08
When you use AWS Glue in Data Catalog, you get a complete snapshot view of the metadata, along with other visible attributes that can power your data governance capabilities.
- Navigating to tables and databases in Hue using Data Catalog
-
published: 2021-08-07;
modified: 2021-08-07
The integration between Data Catalog and Cloudera Data Warehouse service provides a direct web link to the Hue instance from the Data Catalog web UI, making it easy to navigate across services.
- Support for CDP Private Cloud Base clusters in Data Catalog
-
published: 2022-02-24;
modified: 2022-04-06
Data Catalog now supports discovering and profiling assets that reside in Cloudera Private Cloud Base clusters.
- Supporting High Availability for Profiler services
-
published: 2021-08-07;
modified: 2021-08-07
The Data Catalog profiler services now support the High Availability feature.
- Transitioning Profiler Manager Service into SDX
-
published: 2022-02-24;
modified: 2022-02-24
The Profiler Manager Service has been moved to the SDX infrastructure.
Cloudera AI
- Private Cluster Support
-
published: 2022-01-06;
modified: 2023-07-17
Private Clusters provide a simple way to create a secure cluster, where the API server and the workloads themselves only rely on private IP addresses that are not accessible from the internet.
- CMK Encryption on AWS
-
published: 2021-08-10;
modified: 2022-08-10
Cloudera Machine Learning on AWS is now able to use a Customer Master Key to encrypt data.
- Retry Workspace Installation
-
published: 2023-04-26;
modified: 2023-04-26
When workspace provisioning encounters a problem, you can restart the provisioning process from the point where it stopped.
Cloudera Management Console
- Horizontal scaling for the Data Lake
-
published: 2024-03-22;
modified: 2024-12-04
An enterprise Data Lake can be scaled horizontally, meaning that you can add instances to dedicated host groups for some services.
- Disk Vertical Scaling — Disk Type Change and Resizing in Azure
-
published: 2023-12-12;
modified: 2024-12-04
The standard magnetic storage disks attached to Data Lake and Data Hub clusters can be changed or resized without downtime.
- GCS Fine-Grained Access Control
-
published: 2023-09-23;
modified: 2024-12-04
Register a GCP environment with Ranger Authorization Service enabled to allow Google Cloud Storage users to use the fine-grained access policies and audit capabilities available in Apache Ranger.
- Cluster Orchestrator Component Password Rotation
-
published: 2023-03-02;
modified: 2024-12-04
If required, you can use the CDP CLI to manually rotate the cluster orchestrator component password.
- Disabling S3Guard in an Existing Cloudera Environment
-
published: 2022-10-05;
modified: 2024-12-04
You may need to disable S3Guard when upgrading your Data Lakes or Cloudera Data Hub clusters. Use the Beta CDP CLI to disable S3Guard in an existing Cloudera environment.
- Azure VM Encryption at Host
-
published: 2022-06-06;
modified: 2024-12-04
You can optionally enable encryption at host for Data Lake, FreeIPA, and Cloudera Data Hub clusters. Currently, you need to enable it individually for each Virtual Machine in the Azure Portal.
- New UI for adding a Cloudera Private Cloud Base cluster
-
published: 2022-03-29;
modified: 2024-12-04
Register a Cloudera Private Cloud Base cluster as a classic cluster using Cloudera Manager and Knox endpoints so that you can use this cluster in Cloudera Replication Manager and Cloudera Data Catalog services.
Cloudera Replication Manager
- Snapshot Policies in Replication Manager
-
published: 2022-02-25;
modified: 2022-02-25
You can create HDFS and HBase snapshot policies in Replication Manager to schedule snapshots of snapshottable HDFS directories and HBase tables at regular intervals. An HDFS directory is snapshottable when it, or one of its parent directories, has been enabled for snapshots in Cloudera Manager.
Cloudera Operational Database
- HBase Time-based Data Tiering using Persistent BucketCache
-
published: 2024-10-01;
modified: 2024-10-09
Learn how you can configure time-based priority caching for HBase and define a specific time range for the cache eviction policy.