CDP Public Cloud: July 2020 Release Summary

Cloudera is pleased to announce the latest updates to CDP Public Cloud.

Highlights

  • Cloudera Data Warehouse is now Generally Available on Microsoft Azure
  • [Tech Preview] Ranger Authorization Service (RAZ) for Microsoft Azure ADLS
  • [Tech Preview] Data Discovery and Exploration (SOLR) template for DataHub

New Or Updated Capabilities

Documentation

  • CDW Documentation Updates

    • Adding access to external S3 buckets for CDW clusters on AWS

      • Adding buckets in the same AWS account
      • Adding buckets from a different AWS account

Cloudera Data Warehouse

  • CDW on Azure in now Generally Available

    • CDW is now Generally Available on both AWS and Azure.

    • Cloudera Data Warehouse is a CDP service for self-service creation of independent data warehouses and data marts that autoscale to meet your varying workload demands.

Cloudera Machine Learning

  • Ability to manually select subnets for new ML Workspaces

    • When provisioning a new ML Workspace, you can now select which subnets within your virtual network to be associated with the Workspace. With the differences in requirements between the CDP experiences, this helps to ensure that your planned network is being leveraged appropriately by CML.
  • Customizable Session Metadata

    • The overall user experience for starting a new Session has been streamlined. Setting the Session Name is brought to the forefront, and admins can also define mandatory Session Metadata to be captured for each Session, both of which will help to establish further governance over the Sessions being run.

*** Cloudera Data Flow

  • Refresh of NiFi in Flow Management for Data Hub

    • The Flow Management templates available in CDP Data Hub have been updated with a refreshed version of Apache NiFi 1.11.4. They include a number of dependency upgrades to ensure that NiFi successfully integrates with Cloudera Runtime components.
  • Refresh of Kafka in Streams Messaging for Data Hub

    • The Streams Messaging templates for Runtime 7.2.1 in CDP Data Hub have been updated with Apache Kafka 2.5
  • [Tech Preview] Streaming Analytics for Data Hub Template (AWS only)

    • Streaming Analytics, powered by Apache Flink, is now available on CDP Public Cloud as a DataHub cluster template. This enables powerful analytics on top of streaming data, including Kafka topics on your Streams Messaging Data Hub clusters.

    • This tech preview release ships the lightweight cluster template which is intended for small workloads as well as testing/dev environments

  • [Tech Preview] Data Discovery and Exploration Data Hub Template

    • Make data immediately searchable and explorable in a few simple steps with this new CDP Data Hub template. Significantly simplify the process of building a log or a text search application: Ingest data via the Data Flow template, in real time, or use Spark and Solr to quickly index data; Design index schema in HUE and serve your end users via your custom Search UI Application, via the HUE Search Dashboard, or via Cloudera Visual Apps.

    • If you are in need of a searchable large-scale repository (e.g. making your Operational DB accessible and explorable) you can create a custom template with Solr-HBase integration.

    • Available both in AWS and Azure, in CDP Public Cloud Data Hub

SDX

  • [Tech Preview] Ranger Authorization Service (RAZ): Fine-grained access controls for Azure ADLS

    • Reduce security risk, improve compliance and eliminate unintended data deletion with the addition of the Ranger Authentication Service (RAZ) for ADLS Gen 2. CDP on Azure users gain fine-grained access control policy and auditing capabilities on their file system accesses. Rather than handling access controls to cloud storage at the group level, individual users now can have individual home directories that are protected from other users. This unlocks a host of benefits:

      • Spark users can now work against files in their own directories.
      • Classification tags on file system directories can be propagated through data lineage
      • More complete migration for legacy HDP users with HDFS Ranger policies directly ported to CDP in Azure
      • Legacy CDH users now have the comparable file access audit capabilities to those Cloudera Navigator provided.
    • This Tech Preview is only available for Data Hub clusters

Management Console & Control Plane

  • System Auditing is now Generally Available

    • This will enable collection of control plane events for forensic analysis by an external system

    • This feature is only accessible via the CLI; UI support will be available in subsequent releases

  • Non-Transparent Proxy For SDX and DataHub clusters is Generally Available

    • This will make it easier to deploy SDX and DataHub clusters in an isolated network.
  • CDP Control Plane Public API Reference is available here

    • This will enable lookup of descriptions for SDK functions as well as CLI commands

Full release documentation including release notes for each service is available here (Service Name → Release Notes → What’s New).