CDP Public Cloud: February 2021 Release Summary
Cloudera is pleased to announce the latest updates to CDP Public Cloud.
Highlights
- Operational Database is now Generally Available on AWS and Azure
- Data Warehouse now has a Reduced Permissions Mode on AWS, along with significant improvements for performance, upgrades and access control
- Machine Learning now includes Applied ML Prototypes, which provide end-to-end ML use cases that can be deployed with one click
- Data Visualization includes improvements to usability, job scheduling and customization
- Management Console improvements include interactive login mode for CLI, User Delete and Log Anonymization
- SDX improves support for Edge2AI Lineage
- [Tech Preview] Data Catalog adds a Hybrid Cloud mode, allowing you browse and search metadata in a CDP Private Cloud Base cluster
New Or Updated Capabilities
Documentation
-
Management Console
-
FreeIPA HA: You can configure your CDP environment to run FreeIPA in high-availability mode. See Managing FreeIPA.
-
User Delete: Deleting users and machine users.
-
CDP CLI reference: CDP CLI reference documentation is available at https://cloudera.github.io/cdp-dev-docs/cli-docs/.
-
Interactive login for CDP CLI and CDP SDK: Logging into the CDP CLI/SDK.
-
Anonymization Rules: Defining anonymization rules for CDP logs.
-
Configure lifecycle management for logs on AWS and Azure: To avoid unnecessary costs related to Amazon S3 pr ADLS Gen2 cloud storage, you should create lifecycle management rules for your cloud storage location used by CDP for storing logs so that these logs get deleted once they are no longer useful. See Configure lifecycle management for logs on AWS and Configure lifecycle management for logs on Azure.
-
Consolidated documentation for restricting admin and end user access for CDP services: See Restricting access for CDP services that create their own security groups on AWS and Restricting access for CDP services that create their own security groups on Azure.
-
AWS and Azure requirements documentation was updated to include more requirements related to Data Engineering, Data Warehouse, and Machine Learning. These requirements were previously only documented in service-specific docs.
-
-
Data Warehouse
-
Data Engineering
- Expanded CDE CLI how-to with examples
Operational Database
-
Cloudera Operational Database (COD) is now Generally Available on AWS and Azure GA on AWS & Azure
-
Cloudera Operational Database empowers developers to automate and simplify database management with capabilities like auto-scale, auto-heal, and auto-tune.
Data Warehouse
-
Reduced Permission Mode for AWS
-
This feature allows admins to activate CDW environments in AWS using their own user credentials and not need to provide elevated permissions to the cross-account role.
-
When CDW detects that the cross-account role does not have elevated permissions, it shows a “Launch-stack” button in the CDW activation UI.
-
This allows security admins to adhere to principle of least privilege and give cross account role minimum required permissions to manage VWs and DB Catalogs.
-
Data Visualization
-
Job Scheduling and Customization
- Email jobs can now be customized or changed post-creation, making it easy to change the send schedule, add additional recipients, or update the messaging to accompany a dashboard.
-
Usability Improvements
- Improvements and bug fixes for data connections, removal of temporary files, SQL visuals, CSV/Excel downloads and importing visual artifacts.
Machine Learning
-
Applied ML Prototypes
- Applied ML Prototypes (AMPs) are officially launched with brand new Prototypes inside the product, which provide end-to-end projects to help kickstart real customer use cases. The latest AMPs include Deep Learning for Question Answering, Explaining Models with LIME and SHAP, Active Learning, and MLFlow Tracking, totaling to a dozen prototypes made available through our Fast Forward Labs team.
-
Jupyter Notebook Previews
- Notebook files are now able to be viewed without starting a Jupyter Notebook session. This enables Data Scientists to view the results of the latest run of the Notebook very quickly and without having to consume additional compute on the workspace.
Data Catalog
-
[Tech Preview] Hybrid Data Catalog - Browse CDP-PVC Base contents
- Enables the Data Catalog to browse and search metadata in a CDP Private Cloud Base cluster just as if it were another CDP Public Cloud environment.
SDX
-
Edge2AI lineage - Nifi → S3 → ETL → MLOps
- Starting with CDP Runtime 7.2.7, NiFi flows putting files to S3 and ADLSv2 have lineage automatically connected with Hive table based etl automatically and out of the box.
Management Console
-
User Delete
-
CDP administrators now have the ability to delete users in CDP through both the user interface and the CLI.
-
Deleting a user removes all access keys and SSH keys associated with the user, unassigns all roles and resource roles and removes the user from all groups that they belong to.
-
-
Interactive login for CDP CLI and CDP SDK
-
If you would prefer that user access to the CLI/SDK is shorter-lived, you can use the “interactive” method of logging into the CDP CLI/SDK.
-
The interactive method integrates with any SAML-compliant external identity provider
-
-
Anonymization and Redaction of Logs
- CDP includes a set of default anonymization rules and allows you to define custom anonymization rules in order to remove sensitive information from CDP logs
Full release documentation including release notes for each service is available here (Service Name → Release Notes → What’s New).