What's New in Cloudera Documentation
This page describes new topics added and changes made to Cloudera documentation.
- What's New in Cloudera Documentation in May, 2019
- What's New in Cloudera Documentation in April, 2019
- What's New in Cloudera Documentation in March, 2019
- What's New in Cloudera Documentation in February, 2019
- What's New in Cloudera Documentation in January, 2019
- What's New in Cloudera Documentation in November, 2018
- What's New in Cloudera Documentation in October, 2018
- What's New in Cloudera Documentation in September, 2018
- What's New in Cloudera Documentation in August, 2018
- What's New in Cloudera Documentation in July, 2018
- What's New in Cloudera Documentation in June, 2018
- What's New in Cloudera Documentation in May, 2018
- What's New in Cloudera Documentation in April, 2018
- What's New in Cloudera Documentation in March, 2018
- What's New in Cloudera Documentation in February, 2018
- What's New in Cloudera Documentation in January, 2018
What's New in Cloudera Documentation in May, 2019
This section describes new topics added and major changes made to Cloudera documentation in May, 2019:
Product | What's New | Link |
---|---|---|
Cloudera Navigator |
Improved information on
|
What's New in Cloudera Documentation in April, 2019
This section describes new topics added and major changes made to Cloudera documentation in April, 2019:
Product | What's New | Link |
---|---|---|
Cloudera Navigator | Added detailed information on how metadata is extracted from multiple sources and combined to generate lineage for data assets, specifically for Hive. This section helps you understand when you can expect lineage to appear after data assets are created on the cluster. | Metadata Extraction Timing |
What's New in Cloudera Documentation in March, 2019
This section describes new topics added and major changes made to Cloudera documentation in March, 2019:
Product | What's New | Link |
---|---|---|
Cloudera Data Science Workbench | Published a new Security Overview for Cloudera Data Science Workbench. This topic goes over the basics of the CDSW security model, the wildcard DNS requirement, and how authentication, authorization, and wire encryption work in Cloudera Data Science Workbench. | CDSW Security Overview |
What's New in Cloudera Documentation in February, 2019
This section describes new topics added and major changes made to Cloudera documentation in February, 2019:
Product | What's New | Link |
---|---|---|
Apache Impala | Added a new section on load-balancing proxy in TLS-enabled cluster. | Special Proxy Considerations for TLS/SSL-Enabled Clusters |
Added a new section on enabling LDAP in Cloudera Manager. | Enabling LDAP in Cloudera Manager | |
Updated docs to reflect decoupling of compute and storage in Hadoop clusters. | Components of the Impala Server | |
Apache Kudu | Added a recommendation to use nscd (name service caching daemon) for all name resolutions. | Slow Name Resolution and nscd |
Hue | The Hue Guide has been completely re-organized and additional content added. Look for further improvements in the areas of performance tuning and reference architectures to be released soon. |
What's New in Cloudera Documentation in January, 2019
This section describes new topics added and major changes made to Cloudera documentation in January, 2019:
Product | What's New | Link |
---|---|---|
Apache Spark ML | Learn how to configure Spark ML to use native math libraries that accelerate model training speed for algorithms like Alternating Least Squares (ALS). | Using Native Math Libraries to Accelerate Spark Machine Learning Applications |
Workload XM | The new Workload View feature enables you to break down workloads by specific criteria to perform deep-dive analysis on the queries. For example, you can use the Workload View feature to determine which users are executing workloads that do not adhere to SLAs. You can also examine how queries being sent to specific databases or that use specific pools are performing against SLAs. |
What's New in Cloudera Documentation in November, 2018
This section describes new topics added and major changes made to Cloudera documentation in November, 2018:
Product | What's New | Link |
---|---|---|
Apache HBase | Information about how to move the HBase Master role from one host to another. | Moving HBase Master Role to Another Host |
Cloudera Navigator Key Trustee KMS | Added a new procedure that describes how to move a Key Trustee KMS proxy service role instance from an existing cluster host to another cluster host. This feature is for 5.16.1 and later only. | Migrating a Key Trustee KMS Server Role Instance to a New Host |
Workload Experience Manager (Workload XM) | With the release of Cloudera Manager 5.16.1, Workload XM adds log and query redaction and proxy server support. You enable these features by configuring the Telemetry Publisher service in Cloudera Manager. In addition, you can now download the SQL commands to address the "Corrupt Table Statistics" and "Missing Table Statistics" query health checks, and numerous usability enhancements have been added. | What's New from Workload XM |
Information about how to configure Workload XM to tunnel through a firewall in your environment. | Configuring a Firewall for Workload XM | |
Detailed description of the diagnostic data collection performed by Workload XM. | Workload XM Diagnostic Data Collection |
What's New in Cloudera Documentation in October, 2018
This section describes new topics added and major changes made to Cloudera documentation in October, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Data Science Workbench | Released Cloudera Data Science Workbench 1.4.2. This release fixes some critical bugs, including TSB-346: Risk of Data Loss on Cloudera Data Science Workbench Shutdown and Restart. Please read the TSB and Upgrade Notes carefully before you start upgrading or perform a shutdown/restart operation on any previous version of CDSW. | |
Apache HDFS | Documented how to enable authorization for HDFS web UIs. | Enabling Authorization for HDFS Web UIs |
Apache Impala | Documented using a query option to set an execution time limit on queries. | Setting Time Limits on Long Running Queries |
Documented that the scheduler-related query hints and options support Kudu tablets. | | |
Apache Kudu | Noted that it is better to let Kudu manage its own striping over multiple devices rather than delegating the striping to a RAID-0 array. | Kudu Configuration |
Cloudera Navigator | Added detailed steps for streaming Navigator audit events to a Kafka topic. | Publishing Audit Events to Kafka |
Added tips on how to get the most from your Navigator Audit Server implementation, including some maintenance steps that will help you make sure you are collecting the right audit events. | Maintaining Navigator Audit Server | |
Apache Sentry | Updated the Sentry privilege tables for Hive and Impala. They now include all possible privileges on each possible scope. | Privilege Tables for Hive and Impala |
What's New in Cloudera Documentation in September, 2018
This section describes new topics added and major changes made to Cloudera documentation in September, 2018:
Product | What's New | Link |
---|---|---|
Apache Impala | Documented missing query options: | |
Added a table of all Impala functions with links to each function as an alternative approach to having a page for each built-in function. | Impala Built-In Functions | |
Reformatted the built-in functions docs for better readability. | | |
Refactored the Impala Authorization doc to focus on the Sentry privilege model and de-emphasize the policy file-based model. | Enabling Sentry Authorization for Impala | |
Apache Kudu | Added a best practice section to the Kudu-Spark Integration docs on avoiding multiple Kudu clients per cluster. | Developing Applications With Apache Kudu |
Added troubleshooting information on detecting ext2 and ext3 filesystems. | Troubleshooting Apache Kudu | |
Apache YARN | Added new topic that describes all aspects of creating and managing YARN ACLs. | Managing YARN ACLs |
What's New in Cloudera Documentation in August, 2018
This section describes new topics added and major changes made to Cloudera documentation in August, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Data Science Workbench | Added two new videos that demonstrate how to run experiments and deploy models with Cloudera Data Science Workbench. | |
Apache Sentry | There is a new video on the Cloudera YouTube channel that shows how you can verify that your HDFS ACLs are synching with Sentry. The video also shows that URI privileges are not applied as ACLs in HDFS. | How to verify that HDFS ACLs are synching with Sentry |
CDH - YARN | Updated the YARN tuning guide with new values. | Tuning YARN |
Cloudera Altus | Description of Altus groups and their usage. | Groups |
Cloudera Navigator | Added a new video that describes how to make sure your audit system is doing what you expect: Are you collecting the right events? Are you retaining them as long as you need them? Are you archiving them where they are retrievable? | Navigator Audit Checkup Video [Youtube] |
Workload Experience Manager (Workload XM) | Cloudera's Workload XM launched this month and with it a new documentation set that explains how to use this tool to gain in-depth understanding of the workloads you send to clusters managed by Cloudera Manager. It provides information that can be used for troubleshooting failed jobs and for optimizing slow jobs that run on those clusters. | Workload Experience Manager |
What's New in Cloudera Documentation in July, 2018
This section describes new topics added and major changes made to Cloudera documentation in July, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Data Science Workbench | Released Cloudera Data Science Workbench 1.4 with new features: Experiments and Models | 1.4 Release Notes |
Reorganized documentation to align with major product components: Projects, Jobs, Experiments, Models, Engines, and Site Administration. | | |
Improved LDAP/SAML experience with support for group filters. | LDAP and SAML | |
New consolidated section for Engines in Cloudera Data Science Workbench. | Engines Overview | |
New topic that describes how engines are used for experiments and models. | Engines for Experiments & Models | |
This section also includes a topic that lists all the pre-installed packages in CDSW's Python and R kernels. | Pre-Installed Python and R Packages | |
Provided more code samples that demonstrate how to access cluster data from CDSW. | Data Access | |
Cloudera Navigator | Added a new video that gives a light-hearted look at the Cloudera Navigator brand and helps identify the value of each of the Navigator components. | Navigator Brand Video [Youtube] |
Reference Architectures | Cloudera Reference Architectures are now available in HTML format. | Reference Architectures |
Apache Sentry | After upgrading to CDH 5.13.0 and above, some customers experience a period of time in which HDFS ACLs are not synched. Possible reasons for this problem are explained in the Release Notes, along with affected versions and fixes. | |
Cloudera Altus | Added a description of the optional ec2:DeleteKeyPair permission in the AWS cross-account role that determines how Altus generates key pairs for clusters. | Key Pair Permissions on EC2 |
What's New in Cloudera Documentation in June, 2018
This section describes new topics added and major changes made to Cloudera documentation in June, 2018:
Product | What's New | Link |
---|---|---|
Apache Hive - HiveServer2 High Availability | Added new command-line instructions for configuring a proxy load balancer to support HiveServer2 high availability on unmanaged clusters (those not managed by Cloudera Manager) with or without Kerberos. | Configuring HiveServer2 to Load Balance Behind a Proxy on Unmanaged Clusters |
Apache Sentry | Clarified the GRANT ROLE statement on group name restrictions, such as character restrictions, how to use backticks with those restrictions, and OS group name requirements. | GRANT ROLE Statement |
Clarified the description of what happens to synchronized ACLs during Sentry service failure. | HDFS/Sentry Synchronized Permissions | |
Added instructions for how to override Sentry's Kerberos prerequisite for the Hive metastore in Cloudera Manager. | Securing the Hive Metastore | |
Added new Amazon S3 information on creating a table in a bucket. | Creating a Table in a Bucket | |
Added new information on the privileges the Sentry Admin needs in Hue. | Hive SQL Syntax for Use with Sentry | |
Added the SHOW CREATE VIEW operation to the Hive and Impala privilege tables. | Authorization Privilege Model for Hive and Impala | |
Added a new example explaining how a user may see data from a database that they do not have access to if that data is in a view. | Authorization Privilege Model for Hive and Impala | |
CDK Powered by Apache Kafka | Updated Kafka Requirements and Supported Versions with additional information about compatibility: | |
Apache ZooKeeper | Provided instructions for configuring the ZooKeeper server for Kerberos authentication using Cloudera Manager. | Configuring ZooKeeper Server for Kerberos Authentication |
Cloudera Manager | Added a new procedure that describes how to migrate from the Cloudera Manager Embedded PostgreSQL database server to an external PostgreSQL database. | Migrating from the Cloudera Manager Embedded PostgreSQL Database Server to an External PostgreSQL Database |
Cloudera Navigator | Added information on lineage in Navigator: what information is collected and how it is used to create lineage diagrams, what entities are captured in the diagrams, and how diagrams change through the lifecycle of data assets. | Generating Lineage Diagrams |
Cloudera Altus | Added a description of a new Altus environment option for secure clusters. | Enable Secure Clusters |
What's New in Cloudera Documentation in May, 2018
This section describes new topics added and major changes made to Cloudera documentation in May, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Upgrade | Added a new interactive topic that walks you through the steps to upgrade Cloudera Manager. You can select your operating system, upgrade version, and database type and a customized page displays the steps for your upgrade. | Upgrading Cloudera Manager Using Packages |
Added a new interactive topic that walks you through the steps to upgrade CDH using Cloudera Manager. You can select your Cloudera Manager version, CDH upgrade version, and other information and a customized page displays the steps for your upgrade. | Upgrading CDH | |
HDFS Transparent Encryption | Extensively revised the KMS ACL topic, which now includes descriptions of all operations for each ACL class, as well as a diagram and explanation that guides readers through the process of how the KMS evaluates the various ACL classes. | Configuring KMS Access Control Lists (ACLs) |
Key Trustee KMS HA | Added new documentation for a feature that provides logic to detect and warn users about a potential problem where the GPG private keys have not been properly synchronized across all Key Trustee KMS HA hosts. | |
Cloudera Navigator HSM KMS | Added a new topic to guide users through the steps to upgrade an HSM KMS. | Upgrading Cloudera Navigator HSM KMS |
HBase | Added new content that describes how to configure and enable cell-level ACLs for HBase. | Configure Cell-Level Access Control Lists |
Hue | Added new content that clarifies how to migrate the Hue database for MariaDB and MySQL. | MariaDB / MySQL |
Cloudera Altus | Added information about defining custom tags for clusters. Added information about Altus support for CDH 5.14. | Creating a Cluster for AWS |
Restructured Altus documentation to create one doc set for Altus on AWS and Altus on Azure. Altus documentation now includes an Administration Guide and a Data Engineering Guide. | Overview of Cloudera Altus |
What's New in Cloudera Documentation in April, 2018
This section describes new topics added and major changes made to Cloudera documentation in April, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Altus | Added a new topic that describes how to set up an Altus trial account. | Getting Started with a Trial Account |
What's New in Cloudera Documentation in March, 2018
This section describes new topics added and major changes made to the Cloudera documentation library in March, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Data Science Workbench | Added a new video that demonstrates how to get started with a Cloudera Data Science Workbench built-in template project. | CDSW Quickstart Demo [Youtube] |
Added new Known Issues for Cloudera Manager and CDH integration. | Known Issues | |
Added a new topic on migrating a CDSW Deployment to Another Host. | Migrating a CDSW Deployment | |
Revamped the Backup topic with detailed instructions. | Creating a Backup | |
Added a new topic on how to uninstall Cloudera Data Science Workbench. | Uninstalling CDSW | |
JDK Requirements | Added new section on Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction requirements. | Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction |
Navigator | Added a new example for Navigator Audit Server on how to use audit events to determine what caused a schema change to a table. Use audit reports to identify the user or process that may be causing unwanted changes. | Who ran which operation against a table? |
Cloudera Director | Added a new topic on using custom DNS names and DNS servers with auto-TLS. | Using Custom DNS with Auto-TLS in AWS |
What's New in Cloudera Documentation in February, 2018
This section describes new topics added and major changes made to Cloudera documentation in February, 2018:
Product | What's New | Link |
---|---|---|
Flume | The Apache Flume content is moved to a new Flume Guide. Information for configuring, using, and managing Flume is consolidated in the Flume Guide. | Flume Guide |
HBase | The Apache HBase content is moved to a new HBase Guide. All the information for configuring, managing, and troubleshooting HBase is in one central location. | HBase Guide |
Key HSM | Added a new section describing the file naming convention used for encryption zone keys. | Key Naming Convention |
HDFS (Encryption) | Added a new section describing how to resolve an error that can occur when the KMS jute buffer size is insufficient to hold all the tokens. | KMS server jute buffer exception |
Sentry | The Apache Sentry content is moved to a new Sentry Guide. The Sentry Guide contains information on configuring, using, and troubleshooting Sentry, as well as how-to guides. | Sentry Guide |
Cloudera Altus | Added a new topic that describes how to use the Cloudera Altus SDK for Java. | Using the Altus SDK for Java |
What's New in Cloudera Documentation in January, 2018
This section describes new topics added and major changes made to Cloudera documentation in January, 2018:
Product | What's New | Link |
---|---|---|
Cloudera Data Science Workbench | Released Cloudera Data Science Workbench 1.3.0. | |
Impala | Added a tip about using the Kudu Java API, instead of the JDBC interface, for rapid insert operations. | Configuring Impala to Work with JDBC |
Added DATE_TRUNC() function. | Impala Date and Time Functions | |
Added new upper limit for BATCH_SIZE query option. | BATCH_SIZE Query Option | |
Added information about a new kind of runtime filter, the "min-max" filter, which applies to join queries involving Kudu tables. | Using Impala to Query Kudu Tables | |
Added new conditional operators: IS [NOT] TRUE, IS [NOT] FALSE, and IS [NOT] UNKNOWN. | SQL Operators | |
Added information about changes to the output of the SET statement, dividing the options into multiple groups, and hiding some groups by default. New SET ALL syntax shows all the option groups. | SET Statement | |
Added a new impala-shell option --query_option and configuration file section [impala.query_options]. These features both allow specifying values for query options when starting impala-shell. | impala-shell Configuration Options | |
Kafka | Updated examples and removed deprecated properties for how to use Kafka with Flume. | Using Kafka with Flume |
Kafka | Updated Kafka upgrade topic to include versions. | Rolling Upgrade to Kafka 3.0.x |
Key Trustee KMS | Added new procedure for migrating from a Key Trustee KMS (KT KMS) to a Hardware Security Module KMS (HSM KMS). | Migrating from a Key Trustee KMS to an HSM KMS |
Cloudera Manager | Added information on using Cloudera Manager to configure credentials for cluster access to Microsoft ADLS. This access is enabled for running Hive and Impala queries on tables backed by data stored in ADLS and to browse ADLS data using Hue. | Configuring ADLS Access Using Cloudera Manager |
Added information on performing minor maintenance on cluster hosts. Cloudera Manager now fully manages the host decommission and recommission process. You can specify whether or not to replicate under-replicated data blocks to other DataNodes to maintain the cluster's replication factor during a maintenance window. | Tuning and Troubleshooting Host Decommissioning | |
Provided examples for how to use the API to manage BDR. | How To Automate BDR Replication with the Cloudera Manager API | |
Added a video walkthrough for how to add a cluster to Cloudera Manager. | | |
Cloudera Director | Added information on a Cloudera Director 2.7 configuration option that points to an organization's LDAP server so that users' common credentials can be used to log in to Cloudera Director. When LDAP support is enabled, Cloudera Director's built-in user management is disabled. | Configuring Cloudera Director Server for LDAP and Active Directory |
Added information that Cloudera Director can handle all aspects of Java installation on the instances that it allocates and configures for Cloudera Manager and CDH clusters, offering more flexibility while simplifying the process for users. | Deploying Java on Cluster Instances | |
Added a configuration option to Cloudera Director's AWS plugin to accommodate regions like GovCloud and China, where EC2 cannot tag instances upon creation. The documentation now includes the procedure for configuring the plugin to use this option. | Configuring Tag-on-create for AWS GovCloud (US) and China (Beijing) Regions | |
Cloudera Navigator | Added information that metadata searches in Navigator now include the ability to group search results by common properties. Group by lets you use technical, managed, and custom metadata to quickly identify small files, active SQL users, table-creation trends, and other data aggregation trends revealed by metadata properties. The documentation includes some examples of how grouping search results can help you understand trends in your data and find specific data assets. | Grouping Search Results Using Metadata |
Updated Navigator role names to more clearly reflect the privileges they provide. One specific change is that the privilege for editing the name and description metadata for Navigator entities is now part of the Managed & Custom Metadata Editor role. Users with that role or the Full Administrator role can add and update entity names and descriptions in the Navigator console. | Cloudera Navigator User Roles | |
Added information that audit filtering now allows a "not like" operator. | Filtering Audit Events | |
Added information on handling sensitive data that links to the Cloudera Manager log redaction details. | Sensitive Data | |
The documentation now includes the specific metadata removed during Navigator Metadata Server purge tasks. | What Metadata is Purged? | |
Kudu | Added new features and updates to Kudu administration: | Kudu Administration |
Specified how client applications connect to Kerberized Kudu servers. | Client Authentication to Secure Kudu Clusters | |
Cloudera Altus | Added a description of public keys for cluster creation. | Creating a Cluster for AWS |
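
As an illustration of the IS [NOT] TRUE, IS [NOT] FALSE, and IS [NOT] UNKNOWN operators noted in the Impala entries above, the following Python sketch models how such operators behave under standard SQL three-valued logic. It is not Impala code: the function names are hypothetical, and None is assumed to stand in for SQL NULL. The point it shows is that these operators always return true or false, whereas a plain NOT propagates NULL.

```python
from typing import Optional

# Assumption of this sketch: Python's None stands in for SQL NULL.
SqlBool = Optional[bool]


def is_true(value: SqlBool) -> bool:
    """Model `value IS TRUE`: true only for TRUE; FALSE and NULL give false."""
    return value is True


def is_false(value: SqlBool) -> bool:
    """Model `value IS FALSE`: true only for FALSE; TRUE and NULL give false."""
    return value is False


def is_unknown(value: SqlBool) -> bool:
    """Model `value IS UNKNOWN`: true only when the Boolean value is NULL."""
    return value is None


def sql_not(value: SqlBool) -> SqlBool:
    """Model plain SQL NOT, which propagates NULL instead of returning false."""
    return None if value is None else not value


if __name__ == "__main__":
    # Compare `x IS NOT TRUE` with `NOT x` for each possible Boolean value.
    for v in (True, False, None):
        print(f"x={v!s:>5}  IS NOT TRUE={not is_true(v)}  NOT x={sql_not(v)}")
```

The IS NOT variants are simply the negation of the corresponding helper, so `x IS NOT TRUE` holds for a NULL input even though `NOT x` stays NULL.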