Cloudera Base on premises upgrade
▶︎
Release Guide
Cloudera Base on premises Release Guide
Cloudera Release Notes
▶︎
Version and Download Information
Cloudera Manager Version Information
Cloudera Manager Download Information
Cloudera Runtime Version Information
Cloudera Runtime Download Information
Cloudera Base on premises Trial Download Information
▶︎
Product Compatibility Matrices
Replication Manager
Cloudera Manager
▶︎
KMS and Encryption Products
Ranger KMS
Navigator Encrypt
KTS and Key HSM
HSM Support
▶︎
Changes to CDH and HDP Components in Cloudera Base on premises
Updated CDH Components
Updated HDP Components
HDP Core component version changes
Changes to Ambari and HDP services
▶︎
Assessing the Impact of Apache Hive
Apache Hive features
Apache Hive features in Cloudera Data Hub
Apache Hive features in Cloudera Data Warehouse
Apache Hive 3 architectural overview
Apache Hive 3 in Cloudera Data Hub architectural overview
Apache Hive 3 in Cloudera Data Warehouse architectural overview
▶︎
Key semantic changes and workarounds
Casting timestamps
Casting invalid dates
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Handling table reference syntax
Add Backticks to Table References
Handling the Keyword APPLICATION
Dropping partitions
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
Hive unsupported interfaces and features
▶︎
Supplemental Upgrade Topics
Configuring a Local Package Repository
Configuring a Local Parcel Repository
Changes to CDH Hive Tables
Changes to HDP Hive tables
Transitioning Embedded PostgreSQL Database to External PostgreSQL Database
▶︎
Getting Started with Cloudera Upgrade and Migration
Cloudera Upgrade and Migrations Paths
Downloadable Cloudera upgrade checklists
▶︎
Supported in-place upgrade paths
Cloudera Manager Support Matrix
▶︎
Cloudera Base on premises requirements and supported versions
▶︎
Hardware Requirements
▶︎
Cloudera Manager
Cloudera Manager Server
Service Monitor Requirements
Host Monitor
Reports Manager
Agent Hosts
Event Server
Alert Publisher
▶︎
Cloudera Runtime
Atlas
HDFS
HBase
Hive
Hue
Impala
Kafka
Key Trustee Server
Ranger KMS
Kudu
Oozie
Ozone
Phoenix
Ranger
Solr
Spark
Livy
YARN
ZooKeeper
Operating System Requirements
Database Requirements
Java Requirements
Networking and Security Requirements
Data at Rest Encryption Requirements
▶︎
Third-party filesystems
Dell EMC PowerScale
IBM Spectrum Scale
Data Migration Versus Upgrade
▼
In-Place Upgrades
▶︎
From Cloudera Base on premises
Overview
About using this online Upgrade Guide
How much time should I plan for to complete my upgrade?
▶︎
Upgrading the JDK
Using AES-256 Encryption
Configuring a Custom Java Home Location
Tuning JVM Garbage Collection
▶︎
Upgrading the Operating System to a new Major Version
Step 1: Getting Started
▶︎
Step 2: Backing Up
Backing up Cloudera Manager databases
Step 3: Before You Upgrade
Step 4: After You Upgrade
▶︎
Upgrading the Operating System to a new Minor Version
Step 1: Getting Started
▶︎
Step 2: Backing Up
Backing up Cloudera Manager databases
Step 3: Before You Upgrade
Step 4: After You Upgrade
▶︎
Upgrading Cloudera Manager 7
Step 1: Getting Started Upgrading Cloudera Manager 7
▶︎
Step 2: Backing Up
Collect Information
Back Up Cloudera Manager Agent
Back Up the Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Back Up the Databases
Back Up Cloudera Manager Server
(Optional) Start Cloudera Manager Server & Cloudera Management Service
Step 3: Upgrading the Server
Step 4: Upgrading the Agents
▶︎
Step 5: After You Upgrade
Upgrading Cloudera Navigator Key Trustee Server 7.1.x
Upgrading Cloudera Navigator Key HSM
Upgrading Cloudera Navigator Encrypt
Troubleshooting
Reverting a Failed Upgrade
▶︎
Installing dependencies for Hue
Installing the psycopg2 Python package for PostgreSQL database
Installing MySQL client for MySQL databases
Installing MySQL client for MariaDB databases
▶︎
Getting started with Zero Downtime Upgrade (ZDU)
Software Requirements
ZDU Component Support
Service components limitations
Glossary of terminologies
▶︎
Upgrading a Cluster
Step 1: Getting Started
Step 2: Review Notes and Warnings
▶︎
Step 3: Backing Up the Cluster
▶︎
Prerequisites for external database
Oozie database prerequisites
Ranger database prerequisites
▶︎
Step 4: Back Up
Collect Information
Back Up Cloudera Manager Agent
Back Up the Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Back Up the Databases
Back Up Cloudera Manager Server
(Optional) Start Cloudera Manager Server & Cloudera Management Service
Step 5: Access Parcels
Step 6: Enter Maintenance Mode
Step 7: Run the Upgrade Cluster Wizard
Step 8: Finalize the HDFS or Ozone Upgrade
Step 9: Complete Post-Upgrade steps for upgrades to Cloudera Base on premises
Step 10: Exit Maintenance Mode
ZDU known issues
Applying a Service Pack
How to Install Cumulative Hotfix (CHF)
Manual upgrade to Cloudera Base on premises
Troubleshooting
Configuring a Local Package Repository
Configuring a Local Parcel Repository
Applications Upgrade
▶︎
Rollback and Downgrade Cloudera Base on premises
▶︎
Procedure to Downgrade or Rollback from Cloudera Base on premises 7.3.1
Procedure to Downgrade from Cloudera Base on premises 7.3.1
Procedure to Rollback from Cloudera Base on premises 7.3.1
▶︎
Procedure to Downgrade or Rollback from CDP Private Cloud Base 7.1.9
Procedure to Downgrade from CDP Private Cloud Base 7.1.9
Procedure to Rollback from CDP Private Cloud Base 7.1.9
Procedure to Rollback from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.1.9
Procedure to Rollback from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.1.7 SP3
Procedure to Rollback from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.1.8 latest cumulative hotfix
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.7 SP2
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.7 SP1
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.7
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.6
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP2 to CDP Private Cloud Base 7.1.7 SP1
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP1 to CDP Private Cloud Base 7.1.7
Procedure to Rollback from CDP Private Cloud Base 7.1.8 to CDP Private Cloud Base 7.1.7 SP1
▶︎
CDH 6 to CDP Private Cloud Base
Preparing for your upgrade
Assessing the Impact of an Upgrade
How much time should I plan for to complete my upgrade?
About using this online Upgrade Guide
▶︎
Cloudera Base on premises Pre-upgrade transition steps
Set log level for KeyTrustee KMS to INFO
Transitioning from Sentry Policy Files to the Sentry Service
▶︎
Transitioning the Sentry service to Apache Ranger
Configuring a Ranger or Ranger KMS Database: MySQL/MariaDB
Configuring a Ranger Database: PostgreSQL
Configuring a Ranger or Ranger KMS Database: Oracle
Import Key Trustee KMS ACLs to Ranger KMS policies
▶︎
Transitioning Navigator content to Atlas
▶︎
Transition process
Assumptions and prerequisites
Installing Atlas in the Cloudera Manager upgrade wizard
Transitioning Navigator data using customized scripts
Mapping Navigator metadata to Atlas
Transitioning Navigator audits
What's new in Atlas for Navigator Users?
Preparing the backend HMS database for upgrade
▶︎
Migrating Hive 1-2 to Hive 3
Hive Configuration Changes Requiring Consent
Remove transactional=false from Table Properties
Check SERDE Definitions and Availability
▶︎
Checking Apache HBase
Check co-processor classes
Clean the HBase Master procedure store
CDH cluster upgrade requirements for Replication Manager
▶︎
Upgrading the JDK
Using AES-256 Encryption
Configuring a Custom Java Home Location
Tuning JVM Garbage Collection
▶︎
Upgrading the Operating System
Step 1: Getting Started
▶︎
Step 2: Backing Up
Backing up Cloudera Manager databases
Step 3: Before You Upgrade
Step 4: After You Upgrade
▶︎
Upgrading Cloudera Manager 6
Step 1: Getting Started Upgrading Cloudera Manager 6
▶︎
Step 2: Backing Up
Collect Information
Back Up Cloudera Manager Agent
Back Up the Cloudera Management Service
Back Up Cloudera Navigator Data
Stop Cloudera Manager Server & Cloudera Management Service
Back Up the Databases
Back Up Cloudera Manager Server
(Optional) Start Cloudera Manager Server & Cloudera Management Service
Step 3: Upgrading the Server
Step 4: Upgrading the Agents
▶︎
Step 5: After You Upgrade
Upgrade Key Trustee Server to 7.1.x
Upgrade Navigator Encrypt to 7.1.x
Upgrading Cloudera Navigator Key HSM
Upgrading Key Trustee KMS
Troubleshooting
Reverting a Failed Upgrade
Validate TLS configurations
▶︎
Expediting the Hive upgrade
▶︎
Overview of the expedited Hive upgrade
▶︎
Preparing tables for migration
Check SERDE Definitions and Availability
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Make Tables SparkSQL Compatible
Configuring HSMM to prevent migration
Understanding the Hive upgrade
▶︎
Installing dependencies for Hue
Installing the psycopg2 Python package for PostgreSQL database
Installing MySQL client for MySQL databases
Installing MySQL client for MariaDB databases
▶︎
Upgrading a CDH 6 Cluster
Step 1: Getting Started
Step 2: Review Notes and Warnings
Step 3: Backing Up the Cluster
▶︎
Step 4: Back Up Cloudera Manager
Collect Information
Back Up Cloudera Manager Agent
Back Up the Cloudera Management Service
Back Up Cloudera Navigator Data
Stop Cloudera Manager Server & Cloudera Management Service
Back Up the Databases
Back Up Cloudera Manager Server
(Optional) Start Cloudera Manager Server & Cloudera Management Service
▶︎
Step 5: Complete Pre-Upgrade steps for upgrades to Cloudera Base on premises
Run Hue Document Cleanup
Check Oracle Database Initialization
Step 6: Access Parcels
Step 7: Configure Streams Messaging Manager
Step 8: Configure Schema Registry
Step 9: Enter Maintenance Mode
▶︎
Step 10: Run the Upgrade Cluster Wizard
▶︎
Fair Scheduler to Capacity Scheduler transition
▶︎
Plan your scheduler transition
Scheduler transition limitations
Placement Rules transition
Auto-converted Fair Scheduler properties
Fair Scheduler features and conversion details
▶︎
Use the fs2cs conversion utility
CLI options of the fs2cs conversion tool
▶︎
Manual configuration of scheduler properties
Manually add the configurations of yarn-site.xml
Use YARN Queue Manager UI to configure scheduler properties
Use Cloudera Manager Safety Valves to configure scheduler properties
Configure TLS/SSL for Ranger in a manually configured TLS/SSL environment
Step 11: Finalize the HDFS or Ozone Upgrade
Step 12: Complete Post-Upgrade steps for upgrades to Cloudera Base on premises
Step 13: Exit Maintenance Mode
Troubleshooting
Manual upgrade to Cloudera Base on premises
Rolling Back a Cloudera Private Cloud Base Upgrade from version 7.1.9 to CDH 6
Rolling Back a Cloudera Private Cloud Base Upgrade from versions 7.1.1 - 7.1.7 to CDH 6
Rolling Back a Cloudera Private Cloud Base Upgrade from version 7.1.8 to CDH 6
Configuring a Local Package Repository
Configuring a Local Parcel Repository
▶︎
CDH 6 to Cloudera Base on premises post-upgrade transition steps
Update permissions for Replication Manager service
▶︎
Migrating Spark workloads to Cloudera
▶︎
Spark 2.3 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 2.3 to Spark 2.4 changes
Empty schema not supported
CSV header and schema match
Table properties support
Managed table location
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Post-migration tasks
▶︎
Apache Hive Expedited Migration Tasks
▶︎
Preparing tables for migration
Check SERDE Definitions and Availability
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Make Tables SparkSQL Compatible
Creating a list of tables to migrate
Migrating tables to CDP
▶︎
Apache Hive Changes in CDP
▶︎
Preparing tables for migration
Check SERDE Definitions and Availability
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Make Tables SparkSQL Compatible
Hive Configuration Property Changes
LOCATION and MANAGEDLOCATION clauses
▶︎
Handling table reference syntax
Add Backticks to Table References
▶︎
Key semantic changes and workarounds
Casting timestamps
Casting invalid dates
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Handling the Keyword APPLICATION
Handling output of greatest and least functions
TRUNCATE TABLE on an external table
Hive unsupported interfaces and features
Changes to CDH Hive Tables
▶︎
Apache Hive Post-Upgrade Tasks
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Fixing the canary test after upgrading
Configuring HiveServer for ETL using YARN queues
Removing Hive on Spark Configurations
Configuring authorization to tables
Making the Hive plugin for Ranger visible
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Spark integration with Hive
Configure HiveServer HTTP mode
Configuring HMS for high availability
Installing Hive on Tez and adding a HiveServer role
▶︎
Updating Hive and Impala JDBC/ODBC drivers
Getting the JDBC driver
Getting the ODBC driver
▶︎
Apache Impala changes in CDP
Set ACLs for Impala
Impala Configuration Changes
Interoperability between Hive and Impala
Revert to CDH-like Tables
Authorization Provider for Impala
Data Governance Support by Atlas
Handling Data Files
▶︎
Hue post-upgrade tasks
Updating group permissions for Hive query editor
Adding Security Browser to the blocked list of applications
Adding Query Processor service to a cluster
Importing Sentry privileges into Ranger policies
Apache Knox - create plugin audit directory
Apache Ranger TLS Post-Upgrade Tasks
▶︎
Migrating ACLs from Key Trustee KMS to Ranger KMS
Key Trustee KMS operations not supported by Ranger KMS
ACLs supported by Ranger KMS and Ranger KMS Mapping
Apache Hadoop YARN default value changes
Upgrade Notes for Apache Kudu 1.15 / CDP 7.1
Apache HBase post-upgrade tasks
Configure SMM to monitor SRM replications
Configure SMM's service dependency on Schema Registry
Apache Sqoop Changes
Cloudera Search changes
Applications Upgrade
▶︎
CDH 5 to CDP Private Cloud Base
Preparing for your upgrade
Assessing the Impact of an Upgrade
How much time should I plan for to complete my upgrade?
About using this online Upgrade Guide
▶︎
Cloudera Base on premises Pre-upgrade transition steps
Set log level for KeyTrustee KMS to INFO
▶︎
Transitioning from MapReduce 1 to MapReduce 2
Upgrading an MRv1 installation using Cloudera Manager
Major changes when migrating to MapReduce 2
Configuration changes between MRv1 and MRv2
▶︎
Transitioning Cloudera Search configuration
Before you begin
Create HDFS backup directory
Set Solr configuration properties
Download the configuration
Back up Solr configuration and data
▶︎
Transition Solr configuration
Cloudera Manager versions 7.1.1 to 7.2.4
Cloudera Manager versions 7.3.1 or higher
Validate the configuration
Test the configuration
Copy the transitioned configuration to the upgrade metadata directory
About Solr configuration transformation script
Transitioning from Sentry Policy Files to the Sentry Service
▶︎
Transitioning the Sentry service to Apache Ranger
Configuring a Ranger or Ranger KMS Database: MySQL/MariaDB
Configuring a Ranger Database: PostgreSQL
Configuring a Ranger or Ranger KMS Database: Oracle
▶︎
Transitioning Navigator content to Atlas
▶︎
Transition process
Assumptions and prerequisites
Installing Atlas in the Cloudera Manager upgrade wizard
Transitioning Navigator data using customized scripts
Mapping Navigator metadata to Atlas
Transitioning Navigator audits
What's new in Atlas for Navigator Users?
Preparing the backend HMS database for upgrade
▶︎
Migrating Hive 1-2 to Hive 3
Hive Configuration Changes Requiring Consent
Check SERDE Definitions and Availability
Handle Missing Table or Partition Locations
Remove transactional=false from Table Properties
▶︎
Checking Apache HBase
Remove PREFIX_TREE Data Block Encoding
Validate HFiles
Check co-processor classes
CDH cluster upgrade requirements for Replication Manager
▶︎
Upgrading the JDK
Using AES-256 Encryption
Configuring a Custom Java Home Location
Tuning JVM Garbage Collection
▶︎
Upgrading the Operating System
Step 1: Getting Started
▶︎
Step 2: Backing Up
Backing up Cloudera Manager databases
Step 3: Before You Upgrade
Step 4: After You Upgrade
▶︎
Upgrading Cloudera Manager 5
Step 1: Getting Started Upgrading Cloudera Manager 5
▶︎
Step 2: Backing Up
Collect Information
Back Up Cloudera Manager Agent
Back Up the Cloudera Management Service
Back Up Cloudera Navigator Data
Stop Cloudera Manager Server & Cloudera Management Service
Back Up the Databases
Back Up Cloudera Manager Server
(Optional) Start Cloudera Manager Server & Cloudera Management Service
Step 3: Upgrading the Server
Step 4: Upgrading the Agents
▶︎
Step 5: After You Upgrade
Upgrade Key Trustee Server to 7.1.x
Upgrade Navigator Encrypt to 7.1.x
Upgrading Cloudera Navigator Key HSM
Troubleshooting
Reverting a Failed Upgrade
Validate TLS configurations
▶︎
Expediting the Hive upgrade
▶︎
Overview of the expedited Hive upgrade
▶︎
Preparing tables for migration
Check SERDE Definitions and Availability
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Make Tables SparkSQL Compatible
Configuring HSMM to prevent migration
Understanding the Hive upgrade
▶︎
Installing dependencies for Hue
Installing the psycopg2 Python package for PostgreSQL database
Installing MySQL client for MySQL databases
Installing MySQL client for MariaDB databases
▶︎
Upgrading a CDH 5 Cluster
Step 1: Getting Started
Step 2: Review Notes and Warnings
Step 3: Backing Up the Cluster
▶︎
Step 4: Back Up Cloudera Manager
Collect Information
Back Up Cloudera Manager Agent
Back Up the Cloudera Management Service
Back Up Cloudera Navigator Data
Stop Cloudera Manager Server & Cloudera Management Service
Back Up the Databases
Back Up Cloudera Manager Server
(Optional) Start Cloudera Manager Server & Cloudera Management Service
▶︎
Step 5: Complete Pre-Upgrade steps for upgrades to Cloudera Base on premises
Run Hue Document Cleanup
Check Oracle Database Initialization
Step 6: Access Parcels
Step 7: Configure Streams Messaging Manager
Step 8: Configure Schema Registry
Step 9: Enter Maintenance Mode
▶︎
Step 10: Run the Upgrade Cluster Wizard
▶︎
Fair Scheduler to Capacity Scheduler transition
▶︎
Plan your scheduler transition
Scheduler transition limitations
Placement Rules transition
Auto-converted Fair Scheduler properties
Fair Scheduler features and conversion details
▶︎
Use the fs2cs conversion utility
CLI options of the fs2cs conversion tool
▶︎
Manual configuration of scheduler properties
Manually add the configurations of yarn-site.xml
Use YARN Queue Manager UI to configure scheduler properties
Use Cloudera Manager Safety Valves to configure scheduler properties
Configure TLS/SSL for Ranger in a manually configured TLS/SSL environment
Step 11: Finalize the HDFS or Ozone Upgrade
Step 12: Complete Post-Upgrade steps for upgrades to Cloudera Base on premises
Step 13: Exit Maintenance Mode
Troubleshooting
Manual upgrade to Cloudera Base on premises
Rolling back a Cloudera Base on premises 7 upgrade to CDH 5
Configuring a Local Package Repository
Configuring a Local Parcel Repository
▶︎
CDH 5 to Cloudera Base on premises post-upgrade transition steps
Update permissions for Replication Manager service
▶︎
Migrating Spark workloads to Cloudera
▶︎
Spark 1.6 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 1.6 to Spark 2.4 changes
New Spark entry point SparkSession
DataFrame API registerTempTable deprecated
union replaces unionAll
Empty schema not supported
Referencing a corrupt JSON/CSV record
Dataset and DataFrame API explode deprecated
CSV header and schema match
Table properties support
Managed table location
Write to Hive bucketed tables
Rounding in arithmetic operations
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
▶︎
Compiling and running Spark workloads
Compiling and running a Java-based job
Compiling and running a Scala-based job
Running a Python-based job
Running a job interactively
Post-migration tasks
▶︎
Spark 2.3 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 2.3 to Spark 2.4 changes
Empty schema not supported
CSV header and schema match
Table properties support
Managed table location
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Post-migration tasks
▶︎
Apache Hive Expedited Migration Tasks
▶︎
Preparing tables for migration
Check SERDE Definitions and Availability
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Make Tables SparkSQL Compatible
Creating a list of tables to migrate
Migrating tables to CDP
▶︎
Apache Hive Changes in CDP
Hive Configuration Property Changes
LOCATION and MANAGEDLOCATION clauses
▶︎
Handling table reference syntax
Add Backticks to Table References
▶︎
Key semantic changes and workarounds
Casting timestamps
Casting invalid dates
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Handling the Keyword APPLICATION
Disabling Partition Type Checking
Dropping partitions
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
Hive unsupported interfaces and features
Changes to CDH Hive Tables
Changes to HDP Hive tables
▶︎
Apache Hive Post-Upgrade Tasks
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Configuring HiveServer for ETL using YARN queues
Removing Hive on Spark Configurations
Configuring authorization to tables
Making the Hive plugin for Ranger visible
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Spark integration with Hive
Configure HiveServer HTTP mode
Configuring HMS for high availability
Installing Hive on Tez and adding a HiveServer role
▶︎
Updating Hive and Impala JDBC/ODBC drivers
Getting the JDBC driver
Getting the ODBC driver
▶︎
Apache Impala changes in CDP
Set ACLs for Impala
Impala Configuration Changes
Interoperability between Hive and Impala
Revert to CDH-like Tables
Authorization Provider for Impala
Data Governance Support by Atlas
Handling Data Files
▶︎
Hue post-upgrade tasks
Updating group permissions for Hive query editor
Adding Security Browser to the blocked list of applications
Adding Query Processor service to a cluster
Importing Sentry privileges into Ranger policies
Apache Knox - create plugin audit directory
Apache Ranger TLS Post-Upgrade Tasks
▶︎
Migrating ACLs from Key Trustee KMS to Ranger KMS
Key Trustee KMS operations not supported by Ranger KMS
ACLs supported by Ranger KMS and Ranger KMS Mapping
▶︎
Cloudera Search post-upgrade tasks
Bootstrapping Solr collections
▶︎
Recreating aliases
Upload aliases.json to the upgraded cluster
Reindexing Solr collections
Apache Hadoop YARN default value changes
Apache ZooKeeper ACLs: YARN
Upgrade Notes for Apache Kudu 1.12 / CDP 7.1
Apache HBase post-upgrade tasks
Configure SMM to monitor SRM replications
Configure SMM's service dependency on Schema Registry
Apache Sqoop Changes
Applications Upgrade
▶︎
HDP to CDP Private Cloud Base One Stage upgrade
In-Place Upgrade Overview
▶︎
Cluster environment readiness
Disk space and mountpoint considerations
Downloading and Publishing the Package Repository
Downloading and Publishing the Parcel Repository
Hadoop Users (user:group) and Kerberos Principals
▶︎
Upgrading the cluster's underlying OS
In-Place and Restore
Move and Decommission
Versions and supported services for migration
Software download matrix for HDP 3.1.5 and 2.6.5 to CDP 7.1.x
Sample data ingestion
Merge Independent Hive and Spark Catalogs
Cloudera Manager Installation and Setup
Installing Cloudera Management Service
Setting up CMA server
Registering Ambari Cloudera Manager pair for source cluster
Registering Ambari Cloudera Manager pair for target cluster
Preparing configurations
HDP to CDP Private Cloud Base Upgrade
Execution steps
Troubleshooting
Backup HDP services from CDP 7.1.x
▶︎
Rollback HDP services from CDP 7.1.x
Automated rollback
▶︎
Manual rollback
Restore old configuration symlinks
Kerberos
ZooKeeper
Ambari Infra Solr
▶︎
Ranger
Restore Ranger Admin Database
Restore Ranger KMS Database
HDFS
YARN
HBase
Kafka
Atlas
Hive
Spark
Oozie
Knox
Zeppelin
▼
HDP3 to CDP Private Cloud Base Two Stage upgrade
▶︎
HDP to CDP Upgrade Overview
In-Place Upgrade Overview
CDP Upgrade Readiness
How much time should I plan for to complete my upgrade?
▶︎
Cluster environment readiness
Disk space and mountpoint considerations
Downloading and Publishing the Package Repository
Downloading and Publishing the Parcel Repository
Hadoop Users (user:group) and Kerberos Principals
Sample data ingestion
Merge Independent Hive and Spark Catalogs
▶︎
Ambari and HDP Upgrade Checklist
Ambari upgrade checklist
Download cluster blueprints without hosts
HDP upgrade checklist
Checklist for large clusters
Before upgrading any cluster
Managing MPacks
Changes to Ambari services and views
HDP Core component version changes
▶︎
Upgrading the cluster's underlying OS
In-Place and Restore
Move and Decommission
▶︎
Upgrading Ambari
Before you upgrade Ambari
Backup Ambari
▶︎
Setting up a local repository
Updating Ambari repo files
Updating HDP repo files
Case study for setting up an HDP-GPL local repository
Setting up local repository with temporary internet access
Case study for setting up local repository
Update version repository base urls
Preparing Ambari Repository Configuration File to use Local Repository
Preparing to Upgrade Ambari
Upgrade to Ambari 7.1.x.0
Download cluster blueprints
▶︎
Mandatory Post-Upgrade Tasks
Upgrading Ambari Infra
Upgrading Ambari Log Search
Upgrading Ambari Metrics
▶︎
Upgrading HDP to Cloudera Runtime 7.1.x
▶︎
HDP Prerequisites
Upgrade process
Before upgrading any cluster
▶︎
Backup HDP Cluster
Backup and Restore Databases
▶︎
Backup Ranger
Backup Ranger Admin Database
Backup Ranger KMS Database
▶︎
Backup Atlas
Backup HBase tables
Backup Ambari Infra Solr
Backup Ambari-Metrics
Backup Hive
Backup HBase
Backup Kafka
Backup Oozie
Backup Knox
Backup Logsearch
Backup Zeppelin
Backup HDFS
Backup ZooKeeper
▶︎
Backup databases
▶︎
Before you upgrade
Checkpoint HDFS
▶︎
Pre-upgrade steps
Ranger Service connection with Oracle database
Ranger admin password
Preparing Spark for upgrade
▶︎
Backing up Ambari infra Solr data
▶︎
Back up and upgrade Ambari infra Solr and Ambari Log Search
Generate migration configuration
Back up Ambari Infra Solr data
Remove Existing Collections and Upgrade Binaries
Preparing HBase for upgrade
Preparing the backend HMS database for upgrade
Turn off YARN GPU
▶︎
Preparing HDP Search for upgrade
Before you begin
Download Solr configuration from HDP Search ZooKeeper
▶︎
Transition Solr configuration
Cloudera Manager versions 7.1.1 to 7.2.4
Cloudera Manager versions 7.3.1 or higher
Validate the configuration
Test the configuration
Preparing ZooKeeper for upgrade
▶︎
Preparing Kafka for upgrade
Extract Kafka broker ID
▶︎
Register software repositories
HDP Intermediate bits for 7.1.x.0 Repositories
▶︎
Software download matrix for 3.1.5 to CDP 7.1.x
AM2CM legacy tools download
Install software on the hosts
▶︎
Perform the HDP upgrade
Perform express upgrade
▶︎
Post-HDP-upgrade tasks
Upload HDFS entity information
Ambari infra-migrate and restore
Ambari Metrics and LogSearch
Back up the Ranger configuration
Backup Infra Solr collections
▶︎
Troubleshooting the HDP upgrade
YARN Registry DNS instance fails to start
HDP 3.1.5 to HDP 7.1.7 Intermediate bits Kafka upgrade
Rollback Ambari 7.1.x to Ambari 2.7.5
▶︎
Rollback HDP Services 3.1.5 from CDP 7.1.x
Overview
ZooKeeper
Ambari-Metrics
Ambari Infra Solr
▶︎
Ranger
Restore Ranger Admin Database
Restore Ranger KMS Database
HDFS
YARN
HBase
Kafka
▶︎
Atlas
Restore HBase Tables
Restore ATLAS_ENTITY_AUDIT_EVENTS table
Restore Solr snapshots
Hive
Spark
Oozie
Knox
Zeppelin
Log Search
▼
Transitioning to Cloudera Manager
▶︎
Pre-transition steps
Databases
▶︎
Kerberos
Kerberos principal
▶︎
HDFS
Preparing HDFS
Backup the non-default Rack Awareness Topology script
▶︎
Spark
Spark2/Livy
Ranger
Solr
▶︎
Cloudera Manager Installation and Setup
Installing JDBC Driver
Proxy Cloudera Manager through Apache Knox
▶︎
Transitioning HDP to Cloudera Private Cloud Base
Transitioning HDP 3.1.5 cluster to CDP Private Cloud Base 7.1.x cluster using the AM2CM tool
▼
Post transition steps
Generating keytabs in Cloudera Manager
Enable Auto Start setting
ZooKeeper
Delete ZNodes
Ranger
Ranger KMS
Add Ranger policies for components on the CDP Cluster
Set maximum retention days for Ranger audits
Search post-HDP-upgrade tasks
▶︎
HDFS
Ports
TLS/SSL
HDFS HA
Custom Topology
Add Balancer Role to HDFS
Other review configurations for HDFS
Configuring HDFS properties to optimize log collection
▶︎
Solr
Restore Solr collections on CDP cluster
▶︎
Kafka
Change Kafka port value
Unsetting Kafka Protocol version
Impala
▶︎
YARN
Start job history
YARN MapReduce framework jars
GPU Scheduling
YARN CGroups
Reset ZNode ACLs
▶︎
Placement rules evaluation engine
Converting old mapping rule format to JSON-based placement rule format
Setting the owner and permissions of /user/yarn
▶︎
Spark
Livy2
Tez
▶︎
Hive
Identifying and fixing invalid Hive schema versions
Create HIVE sys database
Setting up Hive metastore for Atlas
HMS health check
HBase
▶︎
Hue
▶︎
Installing Python 3.8
Installing Python 3.8 on CentOS 7 for Hue
Installing Python 3.8 on RHEL 8 for Hue
Installing Python 3.8 on SLES 12 for Hue
Installing Python 3.8 on Ubuntu 18 for Hue
Installing the psycopg2 Python package for PostgreSQL database
Installing MySQL client for MySQL databases
Installing MySQL client for MariaDB databases
Ozone
▶︎
Oozie
Validate Database URL
Installing the new Shared Libraries
Update Oozie properties
Adding Oozie service dependencies
Access Oozie load balancer URL
Oozie Load Balancer configuration
Atlas advanced configuration snippet (Safety valve)
Migrating Atlas data
▶︎
Phoenix
Map Phoenix schemas to HBase namespaces
Starting all services
▶︎
Knox
Topology migration
Migrate Credential Aliases
Migrate signing key
Configure Apache Knox authentication for AD/LDAP
Client Configurations
Securing ZooKeeper
Zeppelin Shiro configurations
▶︎
Migrating Spark workloads to
▶︎
Spark 1.6 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 1.6 to Spark 2.4 changes
New Spark entry point SparkSession
Dataframe API registerTempTable deprecated
union replaces unionAll
Empty schema not supported
Referencing a corrupt JSON/CSV record
Dataset and DataFrame API explode deprecated
CSV header and schema match
Table properties support
CREATE OR REPLACE VIEW and ALTER VIEW not supported
Managed table location
Write to Hive bucketed tables
Rounding in arithmetic operations
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
▶︎
Compiling and running Spark workloads
Compiling and running a Java-based job
Compiling and running a Scala-based job
Running a Python-based job
Running a job interactively
Post-migration tasks
▶︎
Spark 2.3 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 2.3 to Spark 2.4 changes
Empty schema not supported
CSV header and schema match
Table properties support
Managed table location
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Post-migration tasks
▶︎
Apache Hive Changes in CDP
Hive Configuration Property Changes
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Removing the LLAP Queue
Configuring HiveServer for ETL using YARN queues
Configuring authorization to tables
▶︎
Updating Hive and Impala JDBC/ODBC drivers
Getting the JDBC driver
Getting the ODBC driver
Setting up access control lists
Configure encryption zone security
Renaming tables
Configure edge nodes as gateways
Configure HiveServer HTTP mode
Configuring HMS for high availability
Installing Hive on Tez and adding a HiveServer role
▶︎
Handling table reference syntax
Add Backticks to Table References
Unsupported Interfaces and Features
Changes to HDP Hive tables
Configuring External Authentication for Cloudera Manager
▶︎
Additional Services
▶︎
Installing DAS using Ambari
Check cluster configuration for Hive and Tez
Add the DAS service
▶︎
DAS post-installation tasks
Additional configuration tasks
Setting up the tmp directory
▶︎
Configuring DAS for SSL/TLS
Set up trusted CA certificate
Set up self-signed certificates
Configure SSL/TLS in Ambari
▶︎
Configuring user authentication in Ambari
Configuring user authentication using Knox SSO
Configuring user authentication using Knox proxy
Configuring user authentication using SPNEGO
Enabling logout option for secure clusters
▶︎
Troubleshooting DAS installation
▶︎
Problem area: Queries page
Your queries are not appearing on the Queries page
Query column is empty, yet you can see the DAG ID and Application ID
Query column is not empty, but you cannot see the DAG ID and Application ID
You cannot view queries from other users
▶︎
Problem area: Compose page
You cannot see your databases or the query editor is missing
▶︎
You cannot view new databases and tables, or cannot see changes to existing databases or tables
Replication failure in the DAS Event Processor
Problem area: Reports page
DAS service installation fails with the "python files missing" message
DAS does not log me out as expected, or I stay logged in longer than the time specified in the Ambari configuration
▶︎
Getting a 401 - Unauthorized access error message while accessing DAS
Setting up quick links for the DAS UI
Installing DAS using Cloudera Manager
▶︎
Adding Hue service with Cloudera Manager
Install and configure MySQL database
Add the Hue service using Cloudera Manager
Enable Kerberos for authentication
Integrate Hue with Knox
Grant Ranger permissions to new users or groups
Adding Query Processor service to a cluster
Applications Upgrade
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP1 to CDP Private Cloud Base 7.1.7
▶︎
HDP2 to CDP Private Cloud Base Two Stage upgrade
▶︎
HDP to CDP Upgrade Overview
In-Place Upgrade Overview
CDP Upgrade Readiness
How much time should I plan for to complete my upgrade?
▶︎
Cluster environment readiness
Disk space and mountpoint considerations
Downloading and Publishing the Package Repository
Downloading and Publishing the Parcel Repository
Hadoop Users (user:group) and Kerberos Principals
Sample data ingestion
▶︎
Expediting the Hive upgrade
▶︎
Overview of the expedited Hive upgrade
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Check SERDE Definitions and Availability
Make Tables SparkSQL Compatible
Understanding the Hive upgrade
Modifying the HSMM to prevent migration
▶︎
Ambari and HDP Upgrade Checklist
Ambari upgrade checklist
Download cluster blueprints without hosts
HDP upgrade checklist
Checklist for large clusters
Kerberos cluster
Before upgrading any cluster
Managing MPacks
Changes to Ambari and HDP services
HDP Core component version changes
▶︎
Upgrading the cluster's underlying OS
In-Place and Restore
Move and Decommission
▶︎
Upgrading Ambari
Before you upgrade Ambari
▶︎
Setting up a local repository
Updating Ambari repo files
Updating HDP repo files
Case study for setting up an HDP-GPL local repository
Setting up local repository with temporary internet access
Case study for setting up local repository
Update version repository base urls
Preparing Ambari Repository Configuration File to use Local Repository
Backup Ambari
Ambari Behavioral changes
Ambari Properties backup
Review Ambari UI and the Quick Links
Upgrade to Ambari 7.1.x.0
Download cluster blueprints
▶︎
Mandatory Post-Upgrade Tasks
▶︎
Upgrading Ambari Metrics System and SmartSense
Upgrading Ambari Metrics
Backup Ambari-Metrics
Upgrading SmartSense
▶︎
Upgrading HDP to Cloudera Runtime 7.1.x
▶︎
HDP Prerequisites
Upgrade process
Kerberos cluster
Before upgrading any cluster
▶︎
Backup HDP Cluster
Backup and Restore Databases
▶︎
Backup Ranger
Backup Ranger Admin Database
Backup Ranger KMS Database
▶︎
Backup Atlas
Backup HBase tables
Backup Ambari Infra Solr
Backup Hive
Backup HBase
Backup Kafka
Backup Oozie
Backup Knox
Backup Logsearch
Backup Zeppelin
Backup HDFS
Backup ZooKeeper
Backup Ambari
▶︎
Backup databases
▶︎
Before you upgrade
Checkpoint HDFS
▶︎
Register software repositories
HDP Intermediate bits for 7.1.x.0 Repositories
▶︎
Software download matrix for HDP 2.6.5 to CDP 7.1.x
AM2CM legacy tools download
Install software on the hosts
▶︎
Preparing the services for upgrade
▶︎
Backing up Ambari infra data
▶︎
Back up and upgrade Ambari infra and Ambari Log Search
Generate migration configuration
Back up Ambari Infra Solr data
Remove Existing Collections and Upgrade Binaries
Upgrading Ambari Infra
▶︎
Overview of the Migration of the Atlas and Infra Solr Data
Preparing Atlas for upgrade
Place Atlas in migration mode
Ranger Service connection with Oracle database
Ranger admin password
▶︎
Preparing HBase for upgrade
Remove PREFIX_TREE data block encoding
Validate HFile
▶︎
Preparing HDP Search for upgrade
Before you begin
Download Solr configuration from HDP Search ZooKeeper
▶︎
Transition Solr configuration
Cloudera Manager versions 7.1.1 to 7.2.4
Cloudera Manager versions 7.3.1 or higher
Validate the configuration
Test the configuration
▶︎
Preparing Hive for upgrade
Take a Mandatory Snapshot of Hive Tables
Make Tables SparkSQL Compatible
Download the Pre-Upgrade Tool JAR for Compaction
Get a Kerberos Ticket
Run Compaction on Hive Tables
Save Hive Metastore by Dumping
Capture Information about Multiple HiveServers
Hive Pre-Upgrade Tool Command Help
Preparing the backend HMS database for upgrade
▶︎
Preparing Kafka for upgrade
Extract Kafka broker ID
Preparing Zeppelin for upgrade
▶︎
Perform the HDP upgrade
Perform express upgrade
▶︎
Post-HDP-upgrade tasks
Update Ranger passwords
Atlas Migration and HBase Hook settings
Ambari Metrics and LogSearch
Ambari infra-migrate and restore
Upload HDFS entity information
Custom Spark SQL Warehouse Directory
▶︎
Hive post-HDP-upgrade tasks
Checking and correcting Hive table locations
Preventing SparkSQL incompatibility
Correct Hive File Locations
Handle Missing Table or Partition Locations
Managed Table Location Mapping
Check SERDE Definitions and Availability
Verify Zeppelin settings in Ambari
Search post-HDP-upgrade tasks
Backup Infra Solr collections
▶︎
Troubleshooting the HDP upgrade
Hive Metastore corrupt
Missing Hive tables
YARN Registry DNS instance fails to start
Ambari Metrics System (AMS) does not start
Ranger MySQL collation
Rollback Ambari to 2.6.5
▶︎
Rollback HDP Services
Overview
ZooKeeper
Ambari-Metrics
Ambari Infra Solr
▶︎
Ranger
Restore Ranger Admin Database
Restore Ranger KMS Database
HDFS
YARN
HBase
Kafka
▶︎
Atlas
Restore HBase Tables
Restore ATLAS_ENTITY_AUDIT_EVENTS table
Hive
Spark
Oozie
Knox
Zeppelin
Log Search
▶︎
Transitioning to Cloudera Manager
▶︎
Pre-transition steps
Databases
▶︎
Kerberos
Kerberos principal
Atlas migration
▶︎
HDFS
Preparing HDFS
Backup the non-default Rack Awareness Topology script
▶︎
Spark
Spark2/Livy
Ranger
Kerberos - Optional task
Solr
▶︎
Cloudera Manager Installation and Setup
Installing JDBC Driver
Proxy Cloudera Manager through Apache Knox
▶︎
Transitioning HDP to Cloudera Private Cloud Base
Transitioning HDP 2.6.5 cluster to CDP Private Cloud Base 7.1.x cluster using the AM2CM tool
▶︎
Post transition steps
Enable Auto Start setting
Kerberos Principal for Cloudera Manager Server
ZooKeeper
Delete ZNodes
Ranger
Ranger KMS
Add Ranger policies for components on the CDP Cluster
▶︎
Ranger Installation in High Availability with Load Balancer
Create composite keytab for Ranger HA
Set maximum retention days for Ranger audits
▶︎
HDFS
Ports
TLS/SSL
HDFS HA
Custom Topology
Add Balancer Role to HDFS
Other review configurations for HDFS
Configuring HDFS properties to optimize log collection
▶︎
Solr
Restore Solr collections on CDP cluster
▶︎
Kafka
Change Kafka port value
Kafka cluster Kerberos
Unsetting Kafka Protocol version
▶︎
YARN
Start job history
YARN MapReduce framework jars
YARN NodeManager
YARN NodeManager CGroups
Reset ZNode ACLs
▶︎
Placement rules evaluation engine
Converting old mapping rule format to JSON-based placement rule format
YARN owner permission
YARN MapReduce parameter
▶︎
Spark
Livy2
Enabling Spark on YARN for Atlas
Enabling SAC manually on Spark
Tez
▶︎
Hive
Setting up Hive metastore for Atlas
Identifying and fixing invalid Hive schema versions
Fixing statistics
Advanced configuration snippet (Safety valve)
Remove Hive Ranger property
HBase
HBase RegionServer heap size
▶︎
Hue
▶︎
Installing Python 3.8
Installing Python 3.8 on CentOS 7 for Hue
Installing Python 3.8 on RHEL 8 for Hue
Installing Python 3.8 on SLES 12 for Hue
Installing Python 3.8 on Ubuntu 18 for Hue
Installing the psycopg2 Python package for PostgreSQL database
Installing MySQL client for MySQL databases
Installing MySQL client for MariaDB databases
▶︎
Oozie
Validate Database URL
Installing the new Shared Libraries
Update Oozie properties
Access Oozie load balancer URL
Oozie Load Balancer configuration
Atlas advanced configuration snippet (Safety valve)
Migrating Atlas data
▶︎
Phoenix
Map Phoenix schemas to HBase namespaces
Starting all services
Hive Policy Additions
▶︎
Knox
Topology migration
Migrate Credential Aliases
Migrate signing key
Configure Apache Knox authentication for AD/LDAP
Client Configurations
Securing ZooKeeper
Zeppelin Shiro configurations
▶︎
Migrating Spark workloads to
▶︎
Spark 1.6 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 1.6 to Spark 2.4 changes
New Spark entry point SparkSession
Dataframe API registerTempTable deprecated
union replaces unionAll
Empty schema not supported
Referencing a corrupt JSON/CSV record
Dataset and DataFrame API explode deprecated
CSV header and schema match
Table properties support
CREATE OR REPLACE VIEW and ALTER VIEW not supported
Managed table location
Write to Hive bucketed tables
Rounding in arithmetic operations
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
▶︎
Compiling and running Spark workloads
Compiling and running a Java-based job
Compiling and running a Scala-based job
Running a Python-based job
Running a job interactively
Post-migration tasks
▶︎
Spark 2.3 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 2.3 to Spark 2.4 changes
Empty schema not supported
CSV header and schema match
Table properties support
Managed table location
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Post-migration tasks
▶︎
Apache Hive Expedited Migration Tasks
Preparing tables for migration
Creating a list of tables to migrate
Migrating tables to CDP
▶︎
Apache Hive Changes in CDP
Hive Configuration Property Changes
▶︎
Key syntax changes
▶︎
Handling table reference syntax
Add Backticks to Table References
LOCATION and MANAGEDLOCATION clauses
▶︎
Key semantic changes and workarounds
Casting timestamps
Casting invalid dates
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Disabling Partition Type Checking
Dropping partitions
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
Hive unsupported interfaces and features
Changes to CDH Hive Tables
Changes to HDP Hive tables
▶︎
Apache Hive Post-Upgrade Tasks
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Removing the LLAP Queue
Configuring HiveServer for ETL using YARN queues
Removing Hive on Spark Configurations
Configuring authorization to tables
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Spark integration with Hive
Configure HiveServer HTTP mode
Configuring HMS for high availability
Installing Hive on Tez and adding a HiveServer role
▶︎
Updating Hive and Impala JDBC/ODBC drivers
Getting the JDBC driver
Getting the ODBC driver
Configuring External Authentication for Cloudera Manager
▶︎
Additional Services
▶︎
Installing DAS using Ambari
Check cluster configuration for Hive and Tez
Add the DAS service
▶︎
DAS post-installation tasks
Additional configuration tasks
Setting up the tmp directory
▶︎
Configuring DAS for SSL/TLS
Set up trusted CA certificate
Set up self-signed certificates
Configure SSL/TLS in Ambari
▶︎
Configuring user authentication in Ambari
Configuring user authentication using Knox SSO
Configuring user authentication using Knox proxy
Configuring user authentication using SPNEGO
Enabling logout option for secure clusters
▶︎
Troubleshooting DAS installation
▶︎
Problem area: Queries page
Your queries are not appearing on the Queries page
Query column is empty, yet you can see the DAG ID and Application ID
Query column is not empty, but you cannot see the DAG ID and Application ID
You cannot view queries from other users
▶︎
Problem area: Compose page
You cannot see your databases or the query editor is missing
▶︎
You cannot view new databases and tables, or cannot see changes to existing databases or tables
Replication failure in the DAS Event Processor
Problem area: Reports page
DAS service installation fails with the "python files missing" message
DAS does not log me out as expected, or I stay logged in longer than the time specified in the Ambari configuration
▶︎
Getting a 401 - Unauthorized access error message while accessing DAS
Setting up quick links for the DAS UI
Installing DAS using Cloudera Manager
▶︎
Adding Hue service with Cloudera Manager
Install and configure MySQL database
Add the Hue service using Cloudera Manager
Enable Kerberos for authentication
Integrate Hue with Knox
Grant Ranger permissions to new users or groups
Adding Query Processor service to a cluster
Applications Upgrade
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP1 to CDP Private Cloud Base 7.1.7
▶︎
Migrating Workloads
▶︎
Data Migration Tools and Methods for Cloudera Private Cloud
▶︎
Use Replication Manager to migrate to Cloudera Base on premises
Replication Manager in Cloudera Base on premises
Replication Manager
Port and network requirements for Replication Manager on Cloudera Base on premises
Migrating data to Cloudera Base on premises from CDH using Replication Manager
▶︎
Prepare to replicate using replication policies
Cloudera license requirements for Replication Manager
Configuring SSL/TLS certificate exchange between two Cloudera Manager instances
▶︎
Add source cluster as peer to use in replication policies
Adding a peer to use in replication policy
Modifying peers to use in replication policy
Configuring peers with SAML authentication
▶︎
Enabling replication between clusters with Kerberos authentication
Required ports in Kerberos authentication-enabled clusters for replication
Considerations for realm names to use for replication
Preparing Kerberos authentication-enabled clusters for replication
Kerberos connectivity test
Replicating from unsecure to secure clusters
▶︎
Replication of encrypted data
Encrypting data in transit between clusters
Security considerations for encrypted data during replication
Configuring heap size to replicate large directories using replication policies
Retaining logs for Replication Manager
▶︎
Atlas replication policies (technical preview)
Preparing to create Atlas replication policies
Creating Atlas replication policies
Manage, monitor, and troubleshoot Atlas replication policies
▶︎
HDFS replication policies
▶︎
HDFS replication policy considerations
How HDFS replication policy works
Improve network latency during replication job run
Performance and scalability limitations to consider for replication policies
Guidelines to use snapshot diff-based replication
HDFS replication in Sentry-enabled clusters
Specifying hosts to improve HDFS replication policy performance
Creating HDFS replication policy to replicate HDFS data
How to use the post copy reconciliation script for HDFS replication policies
View HDFS replication policy details
View historical details for an HDFS replication policy
Monitoring the performance of HDFS replication policies
▶︎
Hive external table replication policies
▶︎
Hive replication policy considerations
Specifying hosts to improve Hive replication policy performance
Hive tables and DDL commands
Disabling replication of parameters during Hive replication
Accommodate HMS changes for Hive replication policies
Creating a Hive external table replication policy
Sentry to Ranger replication for Hive external tables
Importing Sentry privileges into Ranger policies
Replicating data to Impala clusters
Replicate Impala and Hive User Defined Functions (UDFs)
Monitoring the performance of Hive/Impala replication policies
▶︎
Hive ACID table replication policies
▶︎
Prepare to create Hive ACID table replication policies
Configure two-way trust between clusters
▶︎
Configure parameters for Hive ACID table replication policies
Advanced Hive configuration parameters for Hive ACID table replication policies
Recommended Hive configuration parameters for Hive ACID table replication policies
Parameters to optimize Hive ACID table replication performance
Configure file access control lists for Impala user
Creating Hive ACID table replication policy
Managing Hive ACID table replication policies
Troubleshooting Hive ACID table replication policies
▶︎
Iceberg replication policies
How Iceberg replication policy works
Preparing to create Iceberg replication policies
Creating Iceberg replication policy
Manage and monitor Iceberg replication policies
▶︎
Ozone replication policies
▶︎
Preparing clusters to replicate Ozone data
Configuring properties for OBS bucket replication using Ozone replication policies
Creating Ozone replication policies
Managing Ozone replication policies
▶︎
Ranger replication policies
How Ranger replication policy works
Preparing clusters for Ranger replication policy creation
Creating Ranger replication policies
Managing Ranger replication policies
Managing replication policies
Troubleshooting replication policies between on-premises clusters
▶︎
Snapshots and snapshot policies
How Replication Manager uses snapshots
Creating snapshot policies in Replication Manager
Manage and monitor snapshot policies
Troubleshooting snapshot policies in Replication Manager
Restoring HDFS snapshots in Cloudera Manager
Restoring Ozone snapshots in Cloudera Manager
▶︎
Managing HDFS snapshots in Cloudera Manager
Browsing HDFS directories
Enabling and disabling HDFS snapshots
Taking and deleting HDFS snapshots
Restoring HDFS snapshots in Cloudera Manager
▶︎
Migrating Data Science Workbench to Cloudera AI
▶︎
Migrating Data Science Workbench (CDSW) to Cloudera AI
Prerequisites for CDSW to Cloudera AI migration
▶︎
Repurposing CDSW nodes for Cloudera AI
Removing cluster hosts and Kubernetes nodes
Customizing CDSW for migrating host mounts
Using the CDSW to Cloudera AI Migration tool
Troubleshooting preflight migration check issues
Limitations of migrating CDSW to Cloudera AI
Post-migration tasks
Troubleshooting CDSW migration to Cloudera AI
▶︎
Migrating Fair Scheduler to Capacity Scheduler for Cloudera Private Cloud
Scheduler migration overview
▶︎
Planning your scheduler migration
Scheduler migration limitations
Placement rules migration
Auto-converted Fair Scheduler properties
Fair Scheduler features and conversion details
▶︎
Migrating scheduler using the fs2cs conversion utility
CLI options of the fs2cs conversion tool
▶︎
Manual configuration of scheduler properties
Using YARN Queue Manager UI to configure scheduler properties
Using Cloudera Manager Safety Valves to configure scheduler properties
▶︎
Migrating Spark Data to Cloudera Private Cloud
▶︎
Migrating Spark workloads to Cloudera
▶︎
Spark 1.6 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 1.6 to Spark 2.4 changes
New Spark entry point SparkSession
Dataframe API registerTempTable deprecated
union replaces unionAll
Empty schema not supported
Referencing a corrupt JSON/CSV record
Dataset and DataFrame API explode deprecated
CSV header and schema match
Table properties support
Managed table location
Write to Hive bucketed tables
Rounding in arithmetic operations
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
▶︎
Compiling and running Spark workloads
Compiling and running a Java-based job
Compiling and running a Scala-based job
Running a Python-based job
Running a job interactively
Post-migration tasks
▶︎
Spark 2.3 to Spark 2.4 Refactoring
Handling prerequisites
▶︎
Spark 2.3 to Spark 2.4 changes
Empty schema not supported
CSV header and schema match
Table properties support
Managed table location
Precedence of set operations
HAVING without GROUP BY
CSV bad record handling
Spark 2.4 CSV example
Configuring storage locations
Querying Hive managed tables from Spark
Compiling and running Spark workloads
Post-migration tasks
Spark 2.4 to Spark 3.2 Refactoring
▶︎
Migrating Spark CDP to Cloudera Data Engineering
Cloudera Data Engineering Concepts
Convert Spark Submit commands to CDE CLI Spark Submit commands
Using the Cloudera Data Engineering CLI
Convert Spark Submits to CDE API Requests
Using Swagger Page
Getting Started with CDE Airflow
Using Airflow
Using spark-submit drop-in migration tool for migrating Spark workloads to CDE
▶︎
Migrating Hive Workloads to Cloudera Private Cloud
▶︎
Migrating Hive workloads from CDH
Changes to CDH Hive Tables
▶︎
Configuration changes
Hive Configuration Property Changes
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Configuring HMS for high availability
Setting up Hive metastore for Atlas
Changing the Hive warehouse location
▶︎
Security tasks
Making the Hive plugin for Ranger visible
Configuring authorization to tables
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Configure HiveServer HTTP mode
▶︎
Key syntax changes
Handling table reference syntax
LOCATION and MANAGEDLOCATION clauses
▶︎
Key semantic changes and workarounds
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Dropping partitions
Handling the Keyword APPLICATION
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
▶︎
Other syntax and semantic changes
▶︎
Syntax and semantic changes CDH 6.2.1 to CDP 7.0.3.2
Aliasing tables
ANALYZE TABLE ... COMPUTE STATISTICS PARTIALSCAN removed
Decimal to string change
Decimal literals
hive.stats.collect.rawdatasize removal
HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
Limit scanned partitions
Overflow handling of decimals
▶︎
Functions that changed
ACOS(2) and ASIN(2) return NULL
CAST function results
Casting types with leading or trailing spaces
CORR and COVAR_SAMP compliant with SQL:2011
LENGTH function supported data types
STDDEV_SAMP and VAR_SAMP
▶︎
NULL related behaviors
ORDER BY clause treatment of NULLs
Disallow enabling/enforcing NOT NULL
Default NULL ordering change
Enforcement of NOT NULL constraint
▶︎
Timestamp or date related behaviors
ADD_MONTHS function fix
ADD_MONTHS date validation
Casting invalid dates
FROM_UNIXTIME and UNIX_TIMESTAMP time zone
Handling of CURRENT_TIMESTAMP output format
Handling of Julian dates in UDFs
Handling return type for old date functions
Support for SQL:2016 datetime formats (limited formats)
UNIX_TIMESTAMP behavior
TIMESTAMP based on UTC
UNIX_TIMESTAMP conversion of TIMESTAMPLOCALTZ
▶︎
Semantic changes and workarounds CDP 7.1.1
NVL UDF implementation changes
Improved Handling of External Table Inserts in HDFS
▶︎
Semantic changes and workarounds CDP 7.1.4
Exclusive write lock for MERGE INSERT
Lock implementations to allow zero-wait readers
UNBOUNDED representation in Window functions
Support for 0 ROWS PRECEDING or FOLLOWING
▶︎
Semantic changes and workarounds CDP 7.1.5
Sort behavior in SHOW COLUMNS
Event notification cleanup interval
▶︎
Semantic changes and workarounds CDP 7.1.6
Support for SQL:2016 datetime formats (text, FM, FX)
Casting Timestamp to numeric and vice-versa
Handling trailing zeros of decimal constants
▶︎
Semantic changes and workarounds CDP 7.1.7
Precision and scale changes
▶︎
Semantic changes and workarounds CDP 7.1.7 SP1
Date and timestamp parser changes from LENIENT to STRICT
Date strings are parsed using local timezone
▶︎
Semantic changes and workarounds CDP 7.1.7 SP2
Date and timestamp format changes
▶︎
Semantic changes and workarounds CDP 7.1.7 SP2 CHFx
New property to control datetime formatter
Dates are parsed by ignoring trailing invalid characters
▶︎
Semantic changes and workarounds CDP 7.1.8 CHFx
Handling table column named default
Fix precision and scale inference for aggregate rewriting in Calcite
▶︎
Migrating Spark Apps
Preventing SparkSQL incompatibility
Spark integration with Hive
Removing Hive on Spark Configurations
Disabling Partition Type Checking
Converting Hive CLI scripts to Beeline
Hive unsupported interfaces and features
▶︎
Migrating Hive workloads from HDP 2.6.5
Changes to HDP Hive tables
Checking and correcting Hive table locations
▶︎
Configuration changes
Hive Configuration Property Changes
Customizing critical Hive configurations
Setting Hive Configuration Overrides
Hive Configuration Requirements and Recommendations
Configuring HMS for high availability
Setting up Hive metastore for Atlas
Changing the Hive warehouse location
Removing the LLAP Queue
▶︎
Security tasks
Making the Hive plugin for Ranger visible
Configuring authorization to tables
Setting up access control lists
Configure encryption zone security
Configure edge nodes as gateways
Configure HiveServer HTTP mode
▶︎
Handling syntax changes
Handling table reference syntax
LOCATION and MANAGEDLOCATION clauses
▶︎
Key semantic changes and workarounds
Casting timestamps
Changing incompatible column types
Understanding CREATE TABLE behavior
Configuring legacy CREATE TABLE behavior
Dropping partitions
Handling output of greatest and least functions
Renaming tables
TRUNCATE TABLE on an external table
▶︎
Migrating Spark Apps
Spark integration with Hive
Identifying and fixing invalid Hive schema versions
Fixing statistics
Converting Hive CLI scripts to Beeline
Hive unsupported interfaces and features
▶︎
Replicating Hive data from HDP 3 to CDP
Replicating Hive data
▶︎
Configuring the CDP cluster
Mandatory CDP policy-level properties
Optional CDP policy-level properties
Supported scheduled query operations
▶︎
Configuring the HDP cluster
Mandatory HDP cluster configuration properties
Mandatory HDP policy-level properties
Optional HDP policy-level properties
Configuring wire-encrypted clusters
Example commands for replicating HDP 3 workloads
Troubleshooting Hive replication using REPL
Repl Command Known Issues
Patches Required on HDP
Patches required on CDP
▶︎
Verifying the Hive data replication
Setting up the HDP cluster
Verifying replication
Handling a failed verification
Validating external table replication
Enabling background threads after migration
▶︎
Migration paths from HDP 3 to CDP for LLAP users
▶︎
Migration paths for Hive users
Migration to Cloudera Private Cloud Base or CDP Public Cloud
Migration to Cloudera Data Warehouse
Apache Tez processing of Hive jobs
▶︎
Migration paths for Spark users
Migration to Cloudera Private Cloud Base
HWC changes from HDP to CDP
▶︎
Migrating Hive workloads from Cloudera Base on premises to Cloudera Data Warehouse on premises
Planning a Cloudera Data Warehouse Virtual Warehouse instance
Apache Tez processing of Hive jobs
Migrate Hive workloads from HDP (LLAP) to Cloudera Data Warehouse (LLAP)
Migrate from Cloudera Base on premises (Hive on Tez) to Cloudera Data Warehouse (LLAP)
▶︎
Migrating Hive workloads to ACID
Tables in Hive 1 and 2 vs. Hive 3
Compatible storage formats
Table design considerations
Hive ingest patterns introduction
Classic ingest patterns
ACID ingest patterns
Handling government regulations in ACID tables
Key concepts about ACID ingest patterns
▶︎
Migrating Impala Workloads to Cloudera Private Cloud
Overview
▶︎
CDH/HDP to Cloudera Base on premises
▶︎
Migration options
Side-car migration
In-place upgrade with new nodes for Cloudera Data Warehouse Data Service
▶︎
Moving Impala compute workloads from Cloudera Base on premises to Cloudera Data Warehouse Data Service on premises
Workload selection
Steps for migrating Impala workloads to Cloudera Data Warehouse Data Service on premises
▶︎
Reference information
Guidelines for moving repeated BI workloads
Configuration options available in Cloudera Data Warehouse by default
Supported authentication
Setting up ODBC connection from a BI tool
▶︎
Migrating Impala Data to Cloudera Private Cloud
Migrating Impala Data Overview
▶︎
Impala Changes between CDH and CDP
Change location of Datafiles
Set Storage Engine ACLs
Automatic Invalidation/Refresh of Metadata
Metadata Improvements
Default Managed Tables
Automatic Refresh of Tables on Impala Clusters
Interoperability between Hive and Impala
ORC Support Disabled for Full-Transactional Tables
Authorization Provider for Impala
Data Governance Support by Atlas
▶︎
Impala configuration differences in CDH and CDP
Default File Formats
Reconnect to HS2 Session
Automatic Row Count Estimation
Using Reserved Words in SQL Queries
Other Miscellaneous Changes in Impala
Factors to Consider for Capacity Planning
Planning Capacity Using WXM
Performance Differences between CDH and CDP
▶︎
Migrating Kudu Data to Cloudera Private Cloud
▶︎
Kudu migration overview
Backing up data in Kudu
Applying custom Kudu configuration
Copying the backed up data to the Cloudera cluster
Restoring Kudu data into the target Cloudera cluster
▶︎
Migrating Navigator content to Atlas
Migrating Navigator content to Atlas
Migrated features and limitations
▶︎
Migration assumptions and prerequisites
Estimating migration time and resources required
Understanding migrated Navigator entities
▶︎
How the migration process works
Installing Atlas using the Cloudera Manager upgrade wizard
▶︎
Migrating Navigator data using customized scripts
Run the extraction
Run the transformation
Run the import
Validate the migration
Move Atlas out of migration mode
▶︎
Mapping Navigator metadata to Atlas
Migrating Navigator audit information into Atlas
Migrate audit reports and processes to Ranger
What is new in Atlas for Navigator Users
▶︎
Migrating Navigator to Atlas for Cloudera Private Cloud with Cloudera Manager 6
About migrating from Cloudera Navigator to Atlas using Cloudera Manager 6
▶︎
Migrating Navigator when upgrading Cloudera Manager 6 with CDH 6 clusters
Installing the cnav.sh and dependencies
Run the extraction for CDH 6 with Cloudera Manager 6
Removing Navigator role instances
▶︎
Post upgrade operations for your CDH 6 clusters
Run the transformation for CDH 6 with Cloudera Manager 6
Run the import for CDH 6 with Cloudera Manager 6
Moving Atlas out of migration mode
▶︎
Migrating Navigator to Atlas for Cloudera Private Cloud with Cloudera Manager 7
▶︎
About migrating from Cloudera Navigator to Atlas using Cloudera Manager 7
▶︎
Migrating Navigator when upgrading Cloudera Manager 7 with CDH 6 to Cloudera Runtime 7.1.9
Estimating the time and resources needed for transition
Run the extraction for CDH 6 with Cloudera Manager 7
Removing Navigator role instances
▶︎
Post upgrade operations for your CDH 6 clusters
Run the transformation for CDH 6 with Cloudera Manager 7
Run the import for CDH 6 with Cloudera Manager 7
Moving Atlas out of migration mode
▶︎
Migrating Operational Database to Cloudera Private Cloud
▶︎
Migrating HBase
▶︎
Preparing for data migration
Removing PREFIX_TREE Data Block Encoding
Checking co-processor classes
Migrating HBase from CDH or HDP
Verifying and validating if your data is migrated
HBase unsupported features
▶︎
Migrating Accumulo to Cloudera
Migrating to Operational Database
▶︎
Migrating to Accumulo 1.10.0
In-place data upgrade from Accumulo 1.7.0 in HDP 2 to Accumulo 1.10
In-place data upgrade from Accumulo 1.7.0 in HDP 3 to Accumulo 1.10
In-place data upgrade from Accumulo 1.7.2 in CDH 5 to Accumulo 1.10
In-place data upgrade from Accumulo 1.9.2 in CDH 6 to Accumulo 1.10
In-place data upgrade from CDH 6 to Operational Database
▶︎
Migrating Oozie Configurations to Cloudera Private Cloud
▶︎
Migrating Oozie configurations to Cloudera
Migrating Oozie Shared Libraries
▶︎
Migrating Streaming Workloads to Cloudera Private Cloud
▶︎
Migrating Streaming workloads from HDF to Cloudera Private Cloud Base
Set Up a New Streaming Cluster in Cloudera Private Cloud Base
Migrate Ranger Policies
▶︎
Migrate Schema Registry
Copy Raw Data
Reuse Existing Storage
▶︎
Migrate Streams Messaging Manager
Copy Raw Data
Reuse Existing Database
▶︎
Migrate Kafka Using Streams Replication Manager
Migrate Kafka Using the DefaultReplicationPolicy
Migrate Kafka Using the IdentityReplicationPolicy
Migrate Kafka Using the MigratingReplicationPolicy
▶︎
Migrating Sentry to Ranger for Cloudera Private Cloud
Migrating from Sentry to Ranger
Consolidating policies created by Authzmigrator
Customizing the authorization-migration-site.xml file
Check MySQL isolation configuration
Ranger policies allowing create privilege for Hadoop_SQL databases
Ranger policies allowing create privilege for Hadoop_SQL tables
Access required to Read/Write on Hadoop_SQL tables using SQL
Mapping Sentry permissions for Solr to Ranger policies
Configuring Ranger ACL Sync
Use authzmigrator tool to migrate Hive and Kafka permissions to Ranger
▶︎
Authzmigrator tool
Exporting Permissions from Sentry Server
Step 2: Ingesting permissions into Ranger
▶︎
Migrating HDFS data to Ozone for Cloudera Private Cloud
▶︎
Migrating your data from HDFS to Ozone
▶︎
Considerations for transferring data from HDFS to Ozone
HDFS dependency on running Ozone
Roles and sizing considerations for Ozone
Ozone namespace concepts
▶︎
Ozone configurations
Adding Core Configuration service for Ozone
Permission models on Ozone
HDFS namespace mapping to Ozone
▶︎
Process of migrating the HDFS data to Ozone
Verifying the migration prerequisites
Preparing Ozone for data ingestion
Creating additional Ranger policies for the keyadmin user
Moving data from HDFS to Ozone using the distcp command
Validating the migrated data
Cleaning up data on the HDFS source
(Optional) Configure other services to work with Ozone
▶︎
Sidecar migration from HDP to Cloudera
Sidecar migration from HDP to CDP
Pre-migration steps
▶︎
Migrating Services
Atlas
Ranger
HDFS
YARN Capacity Scheduler
HBase
Solr
Oozie
Accumulo
▶︎
Post-migration steps
HDF Streaming Workloads
Test the replicated data
▶︎
Sidecar migration from CDH to Cloudera
Sidecar migration from CDH to CDP
Pre-migration steps
▶︎
Migrating Services
Sentry to Ranger
Navigator to Atlas
HDFS
Hive
YARN Fair Scheduler to Capacity Scheduler
HBase
Impala
Kudu
Solr
Oozie
Accumulo
▶︎
Post-migration steps
Flow management
Testing applications
▶︎
Migrating keys from Key Trustee Server to Ranger KMS
Migrating keys from Key Trustee Server to Ranger KMS
Key migration
▶︎
Key migration in UCL
Navigator Encrypt key migration with HSM
Updating Navigator Encrypt
Rollback of Ranger KMS DB to KTS
▶︎
Cloudera AI Project Migration
▶︎
Migrating Projects
Troubleshooting: Using custom SSL certificates to connect to Cloudera AI Workbenches
Troubleshooting: Retrying a successful or failed migration
Troubleshooting: Running the tool as a background process
Project Migration FAQs
(Optional) Configure other services to work with Ozone
(Optional) Start Cloudera Manager Server & Cloudera Management Service
About migrating from Cloudera Navigator to Atlas using Cloudera Manager 6
About migrating from Cloudera Navigator to Atlas using Cloudera Manager 7
About Solr configuration transformation script
About using this online Upgrade Guide
Access Oozie load balancer URL
Access required to Read/Write on Hadoop_SQL tables using SQL
Accommodate HMS changes for Hive replication policies
Accumulo
ACID ingest patterns
ACLs supported by Ranger KMS and Ranger KMS Mapping
ACOS(2) and ASIN(2) return NULL
Add Backticks to Table References
Add Balancer Role to HDFS
Add Ranger policies for components on the CDP Cluster
Add source cluster as peer to use in replication policies
Add the DAS service
Add the Hue service using Cloudera Manager
Adding a peer to use in replication policy
Adding Core Configuration service for Ozone
Adding Hue service with Cloudera Manager
Adding Oozie service dependencies
Adding Query Processor service to a cluster
Adding Security Browser to the blocked list of applications
Additional configuration tasks
Additional Services
ADD_MONTHS date validation
ADD_MONTHS function fix
Advanced configuration snippet (Safety valve)
Advanced Hive configuration parameters for Hive ACID table replication policies
Agent Hosts
Alert Publisher
Aliasing tables
AM2CM legacy tools download
Ambari and HDP Upgrade Checklist
Ambari Behavioral changes
Ambari Infra Solr
Ambari infra-migrate and restore
Ambari Metrics and LogSearch
Ambari Metrics System (AMS) does not start
Ambari Properties backup
Ambari upgrade checklist
Ambari-Metrics
ANALYZE TABLE ... COMPUTE STATISTICS PARTIALSCAN removed
Apache Hadoop YARN default value changes
Apache HBase post-upgrade tasks
Apache Hive 3 architectural overview
Apache Hive 3 in Cloudera Data Hub architectural overview
Apache Hive 3 in Cloudera Data Warehouse architectural overview
Apache Hive Changes in CDP
Apache Hive Expedited Migration Tasks
Apache Hive features
Apache Hive features in Cloudera Data Hub
Apache Hive features in Cloudera Data Warehouse
Apache Hive Post-Upgrade Tasks
Apache Impala changes in CDP
Apache Knox - create plugin audit directory
Apache Ranger TLS Post-Upgrade Tasks
Apache Sqoop Changes
Apache Tez processing of Hive jobs
Apache ZooKeeper ACLs: YARN
Applications Upgrade
Applying a Service Pack
Applying custom Kudu configuration
Assessing the Impact of an Upgrade
Assessing the Impact of Apache Hive
Assumptions and prerequisites
Atlas
Atlas advanced configuration snippet (Safety valve)
Atlas migration
Atlas Migration and HBase Hook settings
Atlas replication policies (technical preview)
Authorization Provider for Impala
Authzmigrator tool
Auto-converted Fair Scheduler properties
Automated rollback
Automatic Invalidation/Refresh of Metadata
Automatic Refresh of Tables on Impala Clusters
Automatic Row Count Estimation
Back up Ambari Infra Solr data
Back up and upgrade Ambari infra and Ambari Log Search
Back up and upgrade Ambari infra Solr and Ambari Log Search
Back Up Cloudera Manager Agent
Back Up Cloudera Manager Server
Back Up Cloudera Navigator Data
Back up Solr configuration and data
Back Up the Cloudera Management Service
Back Up the Databases
Back up the Ranger configuration
Backing up Ambari infra data
Backing up Ambari infra Solr data
Backing up Cloudera Manager databases
Backing up data in Kudu
Backup Ambari
Backup Ambari Infra Solr
Backup Ambari-Metrics
Backup and Restore Databases
Backup Atlas
Backup databases
Backup HBase
Backup HBase tables
Backup HDFS
Backup HDP Cluster
Backup HDP services from CDP 7.1.x
Backup Hive
Backup Infra Solr collections
Backup Kafka
Backup Knox
Backup Logsearch
Backup Oozie
Backup Ranger
Backup Ranger Admin Database
Backup Ranger KMS Database
Backup the non-default Rack Awareness Topology script
Backup Zeppelin
Backup ZooKeeper
Before upgrading any cluster
Before you begin
Before you upgrade
Before you upgrade Ambari
Bootstrapping Solr collections
Browsing HDFS directories
Capture Information about Multiple HiveServers
Case study for setting up an HDP-GPL local repository
Case study for setting up local repository
CAST function results
Casting invalid dates
Casting Timestamp to numeric and vice-versa
Casting timestamps
Casting types with leading or trailing spaces
CDH 5 to CDP Private Cloud Base
CDH 5 to Cloudera Base on premises post-upgrade transition steps
CDH 6 to CDP Private Cloud Base
CDH 6 to Cloudera Base on premises post-upgrade transition steps
CDH cluster upgrade requirements for Replication Manager
CDH/HDP to Cloudera Base on premises
CDP Upgrade Readiness
Change Kafka port value
Change location of Datafiles
Changes to Ambari and HDP services
Changes to Ambari services and views
Changes to CDH and HDP Components in Cloudera Base on premises
Changes to CDH Hive Tables
Changes to HDP Hive tables
Changing incompatible column types
Changing the Hive warehouse location
Check cluster configuration for Hive and Tez
Check co-processor classes
Check MySQL isolation configuration
Check Oracle Database Initialization
Check SERDE Definitions and Availability
Checking and correcting Hive table locations
Checking Apache HBase
Checking co-processor classes
Checklist for large clusters
Checkpoint HDFS
Classic ingest patterns
Clean the HBase Master procedure store
Cleaning up data on the HDFS source
CLI options of the fs2cs conversion tool
Client Configurations
Cloudera AI Project Migration
Cloudera Base on premises Pre-upgrade transition steps
Cloudera Base on premises Release Guide
Cloudera Base on premises requirements and supported versions
Cloudera Base on premises Trial Download Information
Cloudera Base on premises Upgrade
Cloudera Data Engineering Concepts
Cloudera license requirements for Replication Manager
Cloudera Manager
Cloudera Manager Download Information
Cloudera Manager Installation and Setup
Cloudera Manager Server
Cloudera Manager Support Matrix
Cloudera Manager Version Information
Cloudera Manager versions 7.1.1 to 7.2.4
Cloudera Manager versions 7.3.1 or higher
Cloudera Release Notes
Cloudera Runtime
Cloudera Runtime Download Information
Cloudera Runtime Version Information
Cloudera Search changes
Cloudera Search post-upgrade tasks
Cloudera Upgrade and Migrations Paths
Cluster environment readiness
Collect Information
Compatible storage formats
Compiling and running a Java-based job
Compiling and running a Scala-based job
Compiling and running Spark workloads
Configuration changes
Configuration changes between MRv1 and MRv2
Configuration options available in Cloudera Data Warehouse by default
Configure Apache Knox authentication for AD/LDAP
Configure edge nodes as gateways
Configure encryption zone security
Configure file access control lists for Impala user
Configure HiveServer HTTP mode
Configure parameters for Hive ACID table replication policies
Configure SMM to monitor SRM replications
Configure SMM's service dependency on Schema Registry
Configure SSL/TLS in Ambari
Configure TLS/SSL for Ranger in a manually configured TLS/SSL environment
Configure two-way trust between clusters
Configuring a Custom Java Home Location
Configuring a Local Package Repository
Configuring a Local Parcel Repository
Configuring a Ranger Database: PostgreSQL
Configuring a Ranger or Ranger KMS Database: MySQL/MariaDB
Configuring a Ranger or Ranger KMS Database: Oracle
Configuring authorization to tables
Configuring DAS for SSL/TLS
Configuring External Authentication for Cloudera Manager
Configuring HDFS properties to optimize log collection
Configuring heap size to replicate large directories using replication policies
Configuring HiveServer for ETL using YARN queues
Configuring HMS for high availability
Configuring HSMM to prevent migration
Configuring legacy CREATE TABLE behavior
Configuring peers with SAML authentication
Configuring properties for OBS bucket replication using Ozone replication policies
Configuring Ranger ACL Sync
Configuring SSL/TLS certificate exchange between two Cloudera Manager instances
Configuring storage locations
Configuring the CDP cluster
Configuring the HDP cluster
Configuring user authentication in Ambari
Configuring user authentication using Knox proxy
Configuring user authentication using Knox SSO
Configuring user authentication using SPNEGO
Configuring wire-encrypted clusters
Considerations for realm names to use for replication
Considerations for transferring data from HDFS to Ozone
Consolidating policies created by Authzmigrator
Convert Spark Submit commands to CDE CLI Spark Submit commands
Convert Spark Submits to CDE API Requests
Converting Hive CLI scripts to Beeline
Converting old mapping rule format to JSON-based placement rule format
Copy Raw Data
Copy the transitioned configuration to the upgrade metadata directory
Copying the backed up data to the Cloudera cluster
CORR and COVAR_SAMP compliant with SQL:2011
Correct Hive File Locations
Create composite keytab for Ranger HA
Create HDFS backup directory
Create HIVE sys database
CREATE OR REPLACE VIEW and ALTER VIEW not supported
Creating a Hive external table replication policy
Creating a list of tables to migrate
Creating additional Ranger policies for the keyadmin user
Creating Atlas replication policies
Creating HDFS replication policy to replicate HDFS data
Creating Hive ACID table replication policy
Creating Iceberg replication policy
Creating Ozone replication policies
Creating Ranger replication policies
Creating snapshot policies in Replication Manager
CSV bad record handling
CSV header and schema match
Custom Spark SQL Warehouse Directory
Custom Topology
Customizing CDSW for migrating host mounts
Customizing critical Hive configurations
Customizing the authorization-migration-site.xml file
DAS does not log me out as expected, or I stay logged in longer than the time specified in the Ambari configuration
DAS post-installation tasks
DAS service installation fails with the "python files missing" message
Data at Rest Encryption Requirements
Data Governance Support by Atlas
Data Migration Tools and Methods for Cloudera Private Cloud
Data Migration Versus Upgrade
Database Requirements
Databases
Dataframe API registerTempTable deprecated
Dataset and DataFrame API explode deprecated
Date and timestamp format changes
Date and timestamp parser changes from LENIENT to STRICT
Date strings are parsed using local timezone
Dates are parsed by ignoring trailing invalid characters
Decimal literals
Decimal to string change
Default File Formats
Default Managed Tables
Default NULL ordering change
Delete ZNodes
Delete ZNODES
Dell EMC PowerScale
Disabling Partition Type Checking
Disabling replication of parameters during Hive replication
Disallow enabling/enforcing NOT NULL
Disk space and mountpoint considerations
Download cluster blueprints
Download cluster blueprints without hosts
Download Solr configuration from HDP Search ZooKeeper
Download the configuration
Download the Pre-Upgrade Tool JAR for Compaction
Downloadable Cloudera upgrade checklists
Downloading and Publishing the Package Repository
Downloading and Publishing the Parcel Repository
Dropping partitions
Empty schema not supported
Enable Auto Start setting
Enable Kerberos for authentication
Enabling and disabling HDFS snapshots
Enabling background threads after migration
Enabling logout option for secure clusters
Enabling replication between clusters with Kerberos authentication
Enabling SAC manually on Spark
Enabling Spark on YARN for Atlas
Encrypting data in transit between clusters
Enforcement of NOT NULL constraint
Estimating migration time and resources required
Estimating the time and resources needed for transition
Event notification cleanup interval
Event Server
Example commands for replicating HDP 3 workloads
Exclusive write lock for MERGE INSERT
Execution steps
Expediting the Hive upgrade
Exporting Permissions from Sentry Server
Extract Kafka broker ID
Factors to Consider for Capacity Planning
Fair Scheduler features and conversion details
Fair Scheduler to Capacity Scheduler transition
Fix precision and scale inference for aggregate rewriting in Calcite
Fixing statistics
Fixing the canary test after upgrading
Flow management
From Cloudera Base on premises
FROM_UNIXTIME and UNIX_TIMESTAMP time zone
Functions that changed
Generate migration configuration
Generating keytabs in Cloudera Manager
Get a Kerberos Ticket
Getting a 401 - Unauthorized access error message while accessing DAS
Getting Started with CDE Airflow
Getting Started with Cloudera Upgrade and Migration
Getting started with Zero Downtime Upgrade (ZDU)
Getting the JDBC driver
Getting the ODBC driver
Glossary of terminologies
GPU Scheduling
Grant Ranger permissions to new users or groups
Guidelines for moving repeated BI workloads
Guidelines to use snapshot diff-based replication
Hadoop Users (user:group) and Kerberos Principals
Handling a failed verification
Handle Missing Table or Partition Locations
Handling Data Files
Handling government regulations in ACID tables
Handling of CURRENT_TIMESTAMP output format
Handling of Julian dates in UDFs
Handling output of greatest and least functions
Handling prerequisites
Handling return type for old date functions
Handling syntax changes
Handling table column named default
Handling table reference syntax
Handling the Keyword APPLICATION
Handling trailing zeros of decimal constants
Hardware Requirements
HAVING without GROUP BY
HBase
HBase RegionServer heap size
HBase unsupported features
HDF Streaming Workloads
HDFS
HDFS dependency on running Ozone
HDFS HA
HDFS namespace mapping to Ozone
HDFS replication in Sentry-enabled clusters
HDFS replication policies
HDFS replication policy considerations
HDP 3.1.5 to HDP 7.1.7 Intermediate bits Kafka upgrade
HDP Core component version changes
HDP Intermediate bits for 7.1.x.0 Repositories
HDP Prerequisites
HDP to CDP Private Cloud Base One Stage upgrade
HDP to CDP Private Cloud Base Upgrade
HDP to CDP Upgrade Overview
HDP upgrade checklist
HDP2 to CDP Private Cloud Base Two Stage upgrade
HDP3 to CDP Private Cloud Base Two Stage upgrade
Hive
Hive ACID table replication policies
Hive Configuration Changes Requiring Consent
Hive Configuration Property Changes
Hive Configuration Requirements and Recommendations
Hive external table replication policies
Hive ingest patterns introduction
Hive Metastore corrupt
Hive Policy Additions
Hive post-HDP-upgrade tasks
Hive Pre-Upgrade Tool Command Help
Hive replication policy considerations
Hive tables and DDL commands
Hive unsupported interfaces and features
hive.stats.collect.rawdatasize removal
HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
HMS health check
Host Monitor
How HDFS replication policy works
How Iceberg replication policy works
How much time should I plan for to complete my upgrade?
How Ranger replication policy works
How Replication Manager uses snapshots
How the migration process works
How to Install Cumulative Hotfix (CHF)
How to use the post copy reconciliation script for HDFS replication policies
HSM Support
Hue
Hue post-upgrade tasks
HWC changes from HDP to CDP
IBM Spectrum Scale
Iceberg replication policies
Identifying and fixing invalid Hive schema versions
Impala
Impala Changes between CDH and CDP
Impala Configuration Changes
Impala configuration differences in CDH and CDP
Import Key Trustee KMS ACLs to Ranger KMS policies
Importing Sentry privileges into Ranger policies
Improve network latency during replication job run
Improved Handling of External Table Inserts in HDFS
In-Place and Restore
In-place data upgrade from Accumulo 1.7.0 in HDP 2 to Accumulo 1.10
In-place data upgrade from Accumulo 1.7.0 in HDP 3 to Accumulo 1.10
In-place data upgrade from Accumulo 1.7.2 in CDH 5 to Accumulo 1.10
In-place data upgrade from Accumulo 1.9.2 in CDH 6 to Accumulo 1.10
In-place data upgrade from CDH 6 to Operational Database
In-Place Upgrade Overview
In-place upgrade with new nodes for Cloudera Data Warehouse Data Service
Install and configure MySQL database
Install software on the hosts
Installing Atlas in the Cloudera Manager upgrade wizard
Installing Atlas using the Cloudera Manager upgrade wizard
Installing Cloudera Management Service
Installing DAS using Ambari
Installing DAS using Cloudera Manager
Installing dependencies for Hue
Installing Hive on Tez and adding a HiveServer role
Installing JDBC Driver
Installing MySQL client for MariaDB databases
Installing MySQL client for MySQL databases
Installing Python 3.8
Installing Python 3.8 on CentOS 7 for Hue
Installing Python 3.8 on RHEL 8 for Hue
Installing Python 3.8 on SLES 12 for Hue
Installing Python 3.8 on Ubuntu 18 for Hue
Installing the cnav.sh and dependencies
Installing the new Shared Libraries
Installing the psycopg2 Python package for PostgreSQL database
Integrate Hue with Knox
Interoperability between Hive and Impala
Java Requirements
Kafka
Kafka cluster Kerberos
Kerberos
Kerberos - Optional task
Kerberos cluster
Kerberos connectivity test
Kerberos principal
Kerberos Principal for Cloudera Manager Server
Key concepts about ACID ingest patterns
Key migration
Key migration in UCL
Key semantic changes and workarounds
Key syntax changes
Key Trustee KMS operations not supported by Ranger KMS
Key Trustee Server
KMS and Encryption Products
Knox
KTS and Key HSM
Kudu
Kudu migration overview
LENGTH function supported data types
Limit scanned partitions
Limitations of migrating CDSW to Cloudera AI
Livy
Livy2
LOCATION and MANAGEDLOCATION clauses
Lock implementations to allow zero-wait readers
Log Search
Major changes when migrating to MapReduce 2
Make Tables SparkSQL Compatible
Making the Hive plugin for Ranger visible
Manage and monitor Iceberg replication policies
Manage and monitor snapshot policies
Manage, monitor, and troubleshoot Atlas replication policies
Managed table location
Managed Table Location Mapping
Managing HDFS snapshots in Cloudera Manager
Managing Hive ACID table replication policies
Managing MPacks
Managing Ozone replication policies
Managing Ranger replication policies
Managing replication policies
Mandatory CDP policy-level properties
Mandatory HDP cluster configuration properties
Mandatory HDP policy-level properties
Manual configuration of scheduler properties
Manual rollback
Manual upgrade to Cloudera Base on premises
Manually add the configurations of yarn-site.xml
Map Phoenix schemas to HBase namespaces
Mapping Navigator metadata to Atlas
Mapping Sentry permissions for Solr to Ranger policies
Merge Independent Hive and Spark Catalogs
Metadata Improvements
Migrate audit reports and processes to Ranger
Migrate Credential Aliases
Migrate from Cloudera Base on premises (Hive on Tez) to Cloudera Data Warehouse (LLAP)
Migrate Hive workloads from HDP (LLAP) to Cloudera Data Warehouse (LLAP)
Migrate Kafka Using Streams Replication Manager
Migrate Kafka Using the DefaultReplicationPolicy
Migrate Kafka Using the IdentityReplicationPolicy
Migrate Kafka Using the MigratingReplicationPolicy
Migrate Ranger Policies
Migrate Schema Registry
Migrate signing key
Migrate Streams Messaging Manager
Migrated features and limitations
Migrating Accumulo to Cloudera
Migrating ACLs from Key Trustee KMS to Ranger KMS
Migrating Atlas data
Migrating Data Science Workbench (CDSW) to Cloudera AI
Migrating Data Science Workbench to Cloudera AI
Migrating data to Cloudera Base on premises from CDH using Replication Manager
Migrating Fair Scheduler to Capacity Scheduler for Cloudera Private Cloud
Migrating from Sentry to Ranger
Migrating HBase
Migrating HBase from CDH or HDP
Migrating HDFS data to Ozone for Cloudera Private Cloud
Migrating Hive 1-2 to Hive 3
Migrating Hive workloads from CDH
Migrating Hive workloads from Cloudera Base on premises to Cloudera Data Warehouse on premises
Migrating Hive workloads from HDP 2.6.5
Migrating Hive workloads to ACID
Migrating Hive Workloads to Cloudera Private Cloud
Migrating Impala Data Overview
Migrating Impala Data to Cloudera Private Cloud
Migrating Impala Workloads to Cloudera Private Cloud
Migrating keys from Key Trustee Server to Ranger KMS
Migrating Kudu Data to Cloudera Private Cloud
Migrating Navigator audit information into Atlas
Migrating Navigator content to Atlas
Migrating Navigator data using customized scripts
Migrating Navigator to Atlas for Cloudera Private Cloud with Cloudera Manager 6
Migrating Navigator to Atlas for Cloudera Private Cloud with Cloudera Manager 7
Migrating Navigator when upgrading Cloudera Manager 6 with CDH 6 clusters
Migrating Navigator when upgrading Cloudera Manager 7 with CDH 6 to Cloudera Runtime 7.1.9
Migrating Oozie configurations to Cloudera
Migrating Oozie Configurations to Cloudera Private Cloud
Migrating Oozie Shared Libraries
Migrating Operational Database to Cloudera Private Cloud
Migrating Projects
Migrating scheduler using the fs2cs conversion utility
Migrating Sentry to Ranger for Cloudera Private Cloud
Migrating Services
Migrating Spark Apps
Migrating Spark CDP to Cloudera Data Engineering
Migrating Spark Data to Cloudera Private Cloud
Migrating Spark workloads to
Migrating Spark workloads to Cloudera
Migrating Streaming workloads from HDF to Cloudera Private Cloud Base
Migrating Streaming Workloads to Cloudera Private Cloud
Migrating tables to CDP
Migrating to Accumulo 1.10.0
Migrating to Operational Database
Migrating Workloads
Migrating your data from HDFS to Ozone
Migration assumptions and prerequisites
Migration options
Migration paths for Hive users
Migration paths for Spark users
Migration paths from HDP 3 to CDP for LLAP users
Migration to Cloudera Data Warehouse
Migration to Cloudera Private Cloud Base
Migration to Cloudera Private Cloud Base or CDP Public Cloud
Missing Hive tables
Modifying peers to use in replication policy
Modifying the HSMM to prevent migration
Monitoring the performance of HDFS replication policies
Monitoring the performance of Hive/Impala replication policies
Move and Decommission
Move Atlas out of migration mode
Moving Atlas out of migration mode
Moving data from HDFS to Ozone using the distcp command
Moving Impala compute workloads from Cloudera Base on premises to Cloudera Data Warehouse Data Service on premises
Navigator Encrypt
Navigator Encrypt key migration with HSM
Navigator to Atlas
Networking and Security Requirements
New property to control datetime formatter
New Spark entry point SparkSession
NULL related behaviors
NVL UDF implementation changes
Oozie
Oozie database prerequisites
Oozie Load Balancer configuration
Operating System Requirements
Optional CDP policy-level properties
Optional HDP policy-level properties
ORC Support Disabled for Full-Transactional Tables
ORDER BY clause treatment of NULLs
Other Miscellaneous Changes in Impala
Other review configurations for HDFS
Other syntax and semantic changes
Overflow handling of decimals
Overview
Overview of the expedited Hive upgrade
Overview of the Migration of the Atlas and Infra Solr Data
Ozone
Ozone configurations
Ozone namespace concepts
Ozone replication policies
Parameters to optimize Hive ACID table replication performance
Patches required on CDP
Patches Required on HDP
Perform express upgrade
Perform the HDP upgrade
Performance and scalability limitations to consider for replication policies
Performance Differences between CDH and CDP
Permission models on Ozone
Phoenix
Place Atlas in migration mode
Placement rules evaluation engine
Placement rules migration
Placement Rules transition
Plan your scheduler transition
Planning a Cloudera Data Warehouse Virtual Warehouse instance
Planning Capacity Using WXM
Planning your scheduler migration
Port and network requirements for Replication Manager on Cloudera Base on premises
Ports
Post transition steps
Post upgrade operations for your CDH 6 clusters
Post-HDP-upgrade tasks
Post-migration steps
Post-migration tasks
Pre-migration steps
Pre-transition steps
Pre-upgrade steps
Precedence of set operations
Precision and scale changes
Prepare to create Hive ACID table replication policies
Prepare to replicate using replication policies
Preparing Ambari Repository Configuration File to use Local Repository
Preparing Atlas for upgrade
Preparing clusters for Ranger replication policy creation
Preparing clusters to replicate Ozone data
Preparing configurations
Preparing for data migration
Preparing for your upgrade
Preparing HBase for upgrade
Preparing HDFS
Preparing HDP Search for upgrade
Preparing Hive for upgrade
Preparing Kafka for upgrade
Preparing Kerberos authentication-enabled clusters for replication
Preparing Ozone for data ingestion
Preparing Spark for upgrade
Preparing tables for migration
Preparing the backend HMS database for upgrade
Preparing the services for upgrade
Preparing to create Atlas replication policies
Preparing to create Iceberg replication policies
Preparing Zeppelin for upgrade
Preparing ZooKeeper for upgrade
Prerequisites for CDSW to Cloudera AI migration
Prerequisites for external database
Preventing SparkSQL incompatibility
Problem area: Compose page
Problem area: Queries page
Problem area: Reports page
Procedure to Downgrade from CDP Private Cloud Base 7.1.9
Procedure to Downgrade from Cloudera Base on premises 7.3.1
Procedure to Downgrade or Rollback from CDP Private Cloud Base 7.1.9
Procedure to Downgrade or Rollback from Cloudera Base on premises 7.3.1
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP1 to CDP Private Cloud Base 7.1.7
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP2 to CDP Private Cloud Base 7.1.7 SP1
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.6
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.7
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.7 SP1
Procedure to Rollback from CDP Private Cloud Base 7.1.7 SP3 to CDP Private Cloud Base 7.1.7 SP2
Procedure to Rollback from CDP Private Cloud Base 7.1.8 to CDP Private Cloud Base 7.1.7 SP1
Procedure to Rollback from CDP Private Cloud Base 7.1.9
Procedure to Rollback from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.1.7 SP3
Procedure to Rollback from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.1.8 latest cumulative hotfix
Procedure to Rollback from CDP Private Cloud Base 7.1.9 SP1 to CDP Private Cloud Base 7.1.9
Procedure to Rollback from Cloudera Base on premises 7.3.1
Process of migrating the HDFS data to Ozone
Product Compatibility Matrices
Project Migration FAQs
Proxy Cloudera Manager through Apache Knox
Query column is empty, yet you can see the DAG ID and Application ID
Query column is not empty, but you cannot see the DAG ID and Application ID
Querying Hive managed tables from Spark
Ranger
Ranger admin password
Ranger database prerequisites
Ranger Installation in High Availability with Load Balancer
Ranger KMS
Ranger MySQL collation
Ranger policies allowing create privilege for Hadoop_SQL databases
Ranger policies allowing create privilege for Hadoop_SQL tables
Ranger replication policies
Ranger Service connection with Oracle database
Recommended Hive configuration parameters for Hive ACID table replication policies
Reconnect to HS2 Session
Recreating aliases
Reference information
Referencing a corrupt JSON/CSV record
Register software repositories
Registering Ambari Cloudera Manager pair for source cluster
Registering Ambari Cloudera Manager pair for target cluster
Reindexing Solr collections
Release Guide
Remove Existing Collections and Upgrade Binaries
Remove Hive Ranger property
Remove PREFIX_TREE Data Block Encoding
Remove PREFIX_TREE data block encoding
Remove transactional=false from Table Properties
Removing cluster hosts and Kubernetes nodes
Removing Hive on Spark Configurations
Removing Navigator role instances
Removing PREFIX_TREE Data Block Encoding
Removing the LLAP Queue
Renaming tables
Repl Command Known Issues
Replicate Impala and Hive User Defined Functions (UDFs)
Replicating data to Impala clusters
Replicating from unsecure to secure clusters
Replicating Hive data
Replicating Hive data from HDP 3 to CDP
Replication failure in the DAS Event Processor
Replication Manager
Replication Manager in Cloudera Base on premises
Replication of encrypted data
Reports Manager
Repurposing CDSW nodes for Cloudera AI
Required ports in Kerberos authentication-enabled clusters for replication
Reset ZNode ACLs
Restore ATLAS_ENTITY_AUDIT_EVENTS table
Restore HBase Tables
Restore old configuration symlinks
Restore Ranger Admin Database
Restore Ranger KMS Database
Restore Solr collections on CDP cluster
Restore Solr snapshots
Restoring HDFS snapshots in Cloudera Manager
Restoring Kudu data into the target Cloudera cluster
Restoring Ozone snapshots in Cloudera Manager
Retaining logs for Replication Manager
Reuse Existing Database
Reuse Existing Storage
Revert to CDH-like Tables
Reverting a Failed Upgrade
Review Ambari UI and the Quick Links
Roles and sizing considerations for Ozone
Rollback Ambari 7.1.x to Ambari 2.7.5
Rollback Ambari to 2.6.5
Rollback and Downgrade Cloudera Base on premises
Rollback HDP Services
Rollback HDP Services 3.1.5 from CDP 7.1.x
Rollback HDP services from CDP 7.1.x
Rollback of Ranger KMS DB to KTS
Rolling back a Cloudera Base on premises 7 upgrade to CDH 5
Rolling Back a Cloudera Private Cloud Base Upgrade from version 7.1.8 to CDH 6
Rolling Back a Cloudera Private Cloud Base Upgrade from version 7.1.9 to CDH 6
Rolling Back a Cloudera Private Cloud Base Upgrade from versions 7.1.1 - 7.1.7 to CDH 6
Rounding in arithmetic operations
Run Compaction on Hive Tables
Run Hue Document Cleanup
Run the extraction
Run the extraction for CDH 6 with Cloudera Manager 6
Run the extraction for CDH 6 with Cloudera Manager 7
Run the import
Run the import for CDH 6 with Cloudera Manager 6
Run the import for CDH 6 with Cloudera Manager 7
Run the transformation
Run the transformation for CDH 6 with Cloudera Manager 6
Run the transformation for CDH 6 with Cloudera Manager 7
Running a job interactively
Running a Python-based job
Sample data ingestion
Save Hive Metastore by Dumping
Scheduler migration limitations
Scheduler migration overview
Scheduler transition limitations
Search post-HDP-upgrade tasks
Securing ZooKeeper
Security considerations for encrypted data during replication
Security tasks
Semantic changes and workarounds CDP 7.1.1
Semantic changes and workarounds CDP 7.1.4
Semantic changes and workarounds CDP 7.1.5
Semantic changes and workarounds CDP 7.1.6
Semantic changes and workarounds CDP 7.1.7
Semantic changes and workarounds CDP 7.1.7 SP1
Semantic changes and workarounds CDP 7.1.7 SP2
Semantic changes and workarounds CDP 7.1.7 SP2 CHFx
Semantic changes and workarounds CDP 7.1.8 CHFx
Sentry to Ranger
Sentry to Ranger replication for Hive external tables
Service components limitations
Service Monitor Requirements
Set ACLs for Impala
Set log level for KeyTrustee KMS to INFO
Set maximum retention days for Ranger audits
Set Solr configuration properties
Set Storage Engine ACLs
Set Up a New Streaming Cluster in Cloudera Private Cloud Base
Set up self-signed certificates
Set up trusted CA certificate
Setting Hive Configuration Overrides
Setting the owner and permissions of /user/yarn
Setting up a local repository
Setting up access control lists
Setting up CMA server
Setting up Hive metastore for Atlas
Setting up local repository with temporary internet access
Setting up ODBC connection from a BI tool
Setting up quick links for the DAS UI
Setting up the HDP cluster
Setting up the tmp directory
Side-car migration
Sidecar migration from CDH to CDP
Sidecar migration from CDH to Cloudera
Sidecar migration from HDP to CDP
Sidecar migration from HDP to Cloudera
Snapshots and snapshot policies
Software download matrix for 3.1.5 to CDP 7.1.x
Software download matrix for HDP 2.6.5 to CDP 7.1.x
Software download matrix for HDP 3.1.5 and 2.6.5 to CDP 7.1.x
Software Requirements
Solr
Sort behavior in SHOW COLUMNS
Spark
Spark 1.6 to Spark 2.4 changes
Spark 1.6 to Spark 2.4 Refactoring
Spark 2.3 to Spark 2.4 changes
Spark 2.3 to Spark 2.4 Refactoring
Spark 2.4 CSV example
Spark 2.4 to Spark 3.2 Refactoring
Spark integration with Hive
Spark2/Livy
Specifying hosts to improve HDFS replication policy performance
Specifying hosts to improve Hive replication policy performance
Start job history
Starting all services
STDDEV_SAMP and VAR_SAMP
Step 10: Exit Maintenance Mode
Step 10: Run the Upgrade Cluster Wizard
Step 11: Finalize the HDFS or Ozone Upgrade
Step 12: Complete Post-Upgrade steps for upgrades to Cloudera Base on premises
Step 13: Exit Maintenance Mode
Step 1: Getting Started
Step 1: Getting Started Upgrading Cloudera Manager 5
Step 1: Getting Started Upgrading Cloudera Manager 6
Step 1: Getting Started Upgrading Cloudera Manager 7
Step 2: Backing Up
Step 2: Ingesting permissions into Ranger
Step 2: Review Notes and Warnings
Step 3: Backing Up the Cluster
Step 3: Before You Upgrade
Step 3: Upgrading the Server
Step 4: After You Upgrade
Step 4: Back Up
Step 4: Back Up Cloudera Manager
Step 4: Upgrading the Agents
Step 5: Access Parcels
Step 5: After You Upgrade
Step 5: Complete Pre-Upgrade steps for upgrades to Cloudera Base on premises
Step 6: Access Parcels
Step 6: Enter Maintenance Mode
Step 7: Configure Streams Messaging Manager
Step 7: Run the Upgrade Cluster Wizard
Step 8: Configure Schema Registry
Step 8: Finalize the HDFS or Ozone Upgrade
Step 9: Complete Post-Upgrade steps for upgrades to Cloudera Base on premises
Step 9: Enter Maintenance Mode
Step 9: Enter Maintenance Mode
Steps for migrating Impala workloads to Cloudera Data Warehouse Data Service on premises
Stop Cloudera Manager Server & Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Stop Cloudera Manager Server & Cloudera Management Service
Supplemental Upgrade Topics
Support for 0 ROWS PRECEDING or FOLLOWING
Support for SQL:2016 datetime formats (limited formats)
Support for SQL:2016 datetime formats (text, FM, FX)
Supported authentication
Supported in-place upgrade paths
Supported scheduled query operations
Syntax and semantic changes CDH 6.2.1 to CDP 7.0.3.2
Table design considerations
Table properties support
Table properties support
Table properties support
Table properties support
Table properties support
Table properties support
Table properties support
Table properties support
Table properties support
Tables in Hive 1 and 2 vs. Hive 3
Take a Mandatory Snapshot of Hive Tables
Taking and deleting HDFS snapshots
Test the configuration
Test the configuration
Test the configuration
Test the replicated data
Testing applications
Tez
Tez
Third-party filesystems
TIMESTAMP based on UTC
Timestamp or date related behaviors
TLS/SSL
TLS/SSL
Topology migration
Topology migration
Transition process
Transition process
Transition Solr configuration
Transition Solr configuration
Transition Solr configuration
Transitioning Cloudera Search configuration
Transitioning Embedded PostgreSQL Database to External PostgreSQL Database
Transitioning from MapReduce 1 to MapReduce 2
Transitioning from Sentry Policy Files to the Sentry Service
Transitioning from Sentry Policy Files to the Sentry Service
Transitioning HDP 2.6.5 cluster to CDP Private Cloud Base 7.1.x cluster using the AM2CM tool
Transitioning HDP 3.1.5 cluster to CDP Private Cloud Base 7.1.x cluster using the AM2CM tool
Transitioning HDP to Cloudera Private Cloud Base
Transitioning HDP to Cloudera Private Cloud Base
Transitioning Navigator audits
Transitioning Navigator audits
Transitioning Navigator content to Atlas
Transitioning Navigator content to Atlas
Transitioning Navigator data using customized scripts
Transitioning Navigator data using customized scripts
Transitioning the Sentry service to Apache Ranger
Transitioning the Sentry service to Apache Ranger
Transitioning to Cloudera Manager
Transitioning to Cloudera Manager
Troubleshooting
Troubleshooting
Troubleshooting
Troubleshooting
Troubleshooting
Troubleshooting
Troubleshooting
Troubleshooting CDSW migration to Cloudera AI
Troubleshooting DAS installation
Troubleshooting DAS installation
Troubleshooting Hive ACID table replication policies
Troubleshooting Hive replication using REPL
Troubleshooting preflight migration check issues
Troubleshooting replication policies between on-premises clusters
Troubleshooting snapshot policies in Replication Manager
Troubleshooting the HDP upgrade
Troubleshooting the HDP upgrade
Troubleshooting: Retrying a successful or failed migration
Troubleshooting: Running the tool as a background process
Troubleshooting: Using custom SSL certificates to connect to Cloudera AI Workbenches
TRUNCATE TABLE on an external table
TRUNCATE TABLE on an external table
TRUNCATE TABLE on an external table
TRUNCATE TABLE on an external table
TRUNCATE TABLE on an external table
TRUNCATE TABLE on an external table
Tuning JVM Garbage Collection
Tuning JVM Garbage Collection
Tuning JVM Garbage Collection
Turn off YARN GPU
UNBOUNDED representation in Window functions
Understanding CREATE TABLE behavior
Understanding CREATE TABLE behavior
Understanding CREATE TABLE behavior
Understanding CREATE TABLE behavior
Understanding CREATE TABLE behavior
Understanding CREATE TABLE behavior
Understanding migrated Navigator entites
Understanding the Hive upgrade
Understanding the Hive upgrade
Understanding the Hive upgrade
union replaces unionAll
union replaces unionAll
union replaces unionAll
union replaces unionAll
UNIX_TIMESTAMP behavior
UNIX_TIMESTAMP conversion of TIMESTAMPLOCALTZ
Unsetting Kafka Protocol version
Unsetting Kafka Protocol version
Unsupported Interfaces and Features
Update Oozie properties
Update Oozie properties
Update permissions for Replication Manager service
Update permissions for Replication Manager service
Update Ranger passwords
Update version repository base urls
Update version repository base urls
Updated CDH Components
Updated HDP Components
Updating Ambari repo files
Updating Ambari repo files
Updating group permissions for Hive query editor
Updating group permissions for Hive query editor
Updating HDP repo files
Updating HDP repo files
Updating Hive and Impala JDBC/ODBC drivers
Updating Hive and Impala JDBC/ODBC drivers
Updating Hive and Impala JDBC/ODBC drivers
Updating Hive and Impala JDBC/ODBC drivers
Updating Navigator Encrypt
Upgrade Key Trustee Server to 7.1.x
Upgrade Key Trustee Server to 7.1.x
Upgrade Navigator Encrypt to 7.1.x
Upgrade Navigator Encrypt to 7.1.x
Upgrade Notes for Apache Kudu 1.12 / CDP 7.1
Upgrade Notes for Apache Kudu 1.15 / CDP 7.1
Upgrade process
Upgrade process
Upgrade to Ambari 7.1.x.0
Upgrade to Ambari 7.1.x.0
Upgrading a CDH 5 Cluster
Upgrading a CDH 6 Cluster
Upgrading a Cluster
Upgrading Ambari
Upgrading Ambari
Upgrading Ambari Infra
Upgrading Ambari Infra
Upgrading Ambari Log Search
Upgrading Ambari Metrics
Upgrading Ambari Metrics
Upgrading Ambari Metrics System and SmartSense
Upgrading an MRv1 installation using Cloudera Manager
Upgrading Cloudera Manager 5
Upgrading Cloudera Manager 6
Upgrading Cloudera Manager 7
Upgrading Cloudera Navigator Encrypt
Upgrading Cloudera Navigator Key HSM
Upgrading Cloudera Navigator Key HSM
Upgrading Cloudera Navigator Key HSM
Upgrading Cloudera Navigator Key Trustee Server 7.1.x
Upgrading HDP to Cloudera Runtime 7.1.x
Upgrading HDP to Cloudera Runtime 7.1.x
Upgrading Key Trustee KMS
Upgrading SmartSense
Upgrading the cluster's underlying OS
Upgrading the cluster's underlying OS
Upgrading the cluster's underlying OS
Upgrading the JDK
Upgrading the JDK
Upgrading the JDK
Upgrading the Operating System
Upgrading the Operating System
Upgrading the Operating System to a new Major Version
Upgrading the Operating System to a new Minor Version
Upload aliases.json to the upgraded cluster
Upload HDFS entity information
Upload HDFS entity information
Use authzmigrator tool to migrate Hive and Kafka permissions to Ranger
Use Cloudera Manager Safety Valves to configure scheduler properties
Use Cloudera Manager Safety Valves to configure scheduler properties
Use Replication Manager to migrate to Cloudera Base on premises
Use the fs2cs conversion utility
Use the fs2cs conversion utility
Use YARN Queue Manager UI to configure scheduler properties
Use YARN Queue Manager UI to configure scheduler properties
Using AES-256 Encryption
Using AES-256 Encryption
Using AES-256 Encryption
Using Airflow
Using Cloudera Manager Safety Valves to configure scheduler properties
Using Reserved Words in SQL Queries
Using spark-submit drop-in migration tool for migrating Spark workloads to CDE
Using Swagger Page
Using the CDSW to Cloudera AI Migration tool
Using the Cloudera Data Engineering CLI
Using YARN Queue Manager UI to configure scheduler properties
Validate Database URL
Validate Database URL
Validate HFile
Validate HFiles
Validate the configuration
Validate the configuration
Validate the configuration
Validate the migration
Validate TLS configurations
Validate TLS configurations
Validating external table replication
Validating the migrated data
Verify Zeppelin settings in Ambari
Verifying and validating if your data is migrated
Verifying replication
Verifying the Hive data replication
Verifying the migration prerequisites
Version and Download Information
Versions and supported services for migration
View HDFS replication policy details
View historical details for an HDFS replication policy
What is new in Atlas for Navigator Users
What's new in Atlas for Navigator Users?
What's new in Atlas for Navigator Users?
Workload selection
Write to Hive bucketed tables
Write to Hive bucketed tables
Write to Hive bucketed tables
Write to Hive bucketed tables
YARN
YARN
YARN
YARN
YARN
YARN
YARN Capacity Scheduler
YARN CGroups
YARN Fair Scheduler to Capacity Scheduler
Yarn Mapreduce framework jars
Yarn Mapreduce framework jars
YARN mapreduce paramater
YARN NodeManager
YARN NodeManager CGroups
YARN owner permission
YARN Registry DNS instance fails to start
YARN Registry DNS instance fails to start
You cannot see your databases or the query editor is missing
You cannot see your databases or the query editor is missing
You cannot view new databases and tables, or cannot see changes to existing databases or tables
You cannot view new databases and tables, or cannot see changes to existing databases or tables
You cannot view queries from other users
You cannot view queries from other users
Your queries are not appearing on the Queries page
Your queries are not appearing on the Queries page
ZDU Component Support
ZDU known issues
Zeppelin
Zeppelin
Zeppelin
Zeppelin Shiro configurations
Zeppelin Shiro configurations
ZooKeeper
ZooKeeper
ZooKeeper
ZooKeeper
ZooKeeper
ZooKeeper
Mandatory Post-Upgrade Tasks
Mandatory Post-Upgrade Tasks
Preparing to Upgrade Ambari
«
Filter topics
HDP3 to CDP Private Cloud Base Two Stage upgrade
Tez
Install the Tez tar files on HDFS.
From the Tez service in the cluster, install the Tez tar files on HDFS for use by Hive, and then deploy the client configuration.
1. Run Upload Tez tar file to HDFS.
2. Deploy Client Configuration.
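If the second step needs to be scripted rather than clicked through, one possible approach is the Cloudera Manager REST API, which exposes a cluster-level `deployClientConfig` command. The sketch below only builds the request and does not send it. The host, port, cluster name, credentials, and the API version `v41` are all placeholder assumptions, so check the API version reported by your own Cloudera Manager instance. The Upload Tez tar file to HDFS action is left to the UI here because its API command name is not given in this topic.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_deploy_client_config_request(base_url, cluster_name, user, password,
                                       api_version="v41"):
    """Build (but do not send) a POST request for deployClientConfig.

    api_version is an assumed placeholder; use the version your
    Cloudera Manager instance actually supports.
    """
    # Cluster names can contain spaces, so percent-encode the path segment.
    url = (
        f"{base_url.rstrip('/')}/api/{api_version}"
        f"/clusters/{quote(cluster_name, safe='')}/commands/deployClientConfig"
    )
    # HTTP Basic authentication header built from the supplied credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=b"", method="POST")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/json")
    return request


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    req = build_deploy_client_config_request(
        "https://cm.example.com:7183", "Cluster 1", "admin", "secret"
    )
    print(req.get_method(), req.full_url)
    # To actually submit the command, pass req to urllib.request.urlopen().
```

Keeping the request construction separate from the network call makes it easy to review the target URL before anything is changed on the cluster.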
Parent topic: Post transition steps